How To Search In Code Site-Wide For Any Text





  • By  October 14th, 2015







    Ever wondered if a crucial piece of text or code is present site-wide? Maybe some analytics, tracking, or tag manager code?


    Or how about when you need to find old email addresses, specific spelling errors, or similar? This is where site-wide custom text search can help. With it you can find answers to questions like “which pages on my site are missing Google Analytics”, “how to find old Google Analytics code”, or “is Google Tag Manager placed in the right spot on all pages”.


    A1 Website Analyzer

    One crawler tool that allows for custom search is our secret super tool A1 Website Analyzer. It can search in the full code of a page using regular expressions. Don’t know regular expressions? No worries; if your needs are simple, chances are you can simply write the text you are searching for or use one of the presets. But if you have complex needs, like finding variations of code blocks, regular expressions can be your savior.


    Learning the basics of regular expressions is one of the most valuable things you can do as a web developer, or even just as a geek user. Besides helping you find what you need and perform advanced search-and-replace operations, regular expressions are also used by functions in many code libraries.


    If you already know regex or don’t care, you can skip right to the search tutorial itself.


    Regex



    When using regular expressions, it is important to understand that certain characters and character sequences have a special meaning:



    • “.+” will match any character one or more times.
    • “.*” will match any character zero or more times.
    • “.*?” will match as few characters as possible, stopping as soon as the next part of the regular expression can match something.
    • “\s*” will match any whitespace character zero or more times.
    • “\s+” will match any whitespace character one or more times.
    • “\s” will match exactly one whitespace character.
    • “[0-9a-zA-Z]” will match one English lowercase/uppercase letter or digit.
    • “[^<]*” will match any character except “<” zero or more times.
    • “(center|centre)” will match “center” or “centre”.
    • “(center|centre)?” works like the above, but the group is optional: matching continues with the next part of the regular expression even if neither word is present.
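The behaviour of these building blocks is easy to try out in any regex-capable language. The following is a minimal sketch using Python's standard `re` module as a stand-in; it is an assumption that the crawler's own regex engine behaves the same way for these basic constructs, and the helper name `first_match` and the sample strings are made up for illustration.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text that pattern matches, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

html = "<b>x</b><b>y</b>"

greedy = first_match(r"<b>.*</b>", html)         # ".*" grabs as much as it can
lazy = first_match(r"<b>.*?</b>", html)          # ".*?" stops at the first "</b>"
spaced = first_match(r"id\s*=\s*1", "id  = 1")   # "\s*" absorbs optional whitespace
no_tag = first_match(r"[^<]*", "title<b>")       # everything before the first "<"
either = first_match(r"(center|centre)", "city centre")
optional = first_match(r"(center|centre)?piece", "piece")  # the group may be absent

for label, value in [("greedy", greedy), ("lazy", lazy), ("spaced", spaced),
                     ("no_tag", no_tag), ("either", either), ("optional", optional)]:
    print(f"{label:9}{value!r}")
```

The difference between the greedy and lazy results is the one that most often surprises people when searching HTML: the greedy form runs to the last closing tag on the line, the lazy form stops at the first.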

    Say we want to look for occurrences of the following text strings:



    • search engine peoples
    • Search Engine Peoples
    • Search Engine Professionals

    This regex can find any and all of these:


    (S|s)earch (E|e)ngine (P|p)(rofessionals|eoples)
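To check a pattern like this before running a full crawl, it can help to test it against a handful of strings first. Here is a small sketch in Python, assuming the `re` module is a reasonable stand-in for the crawler's engine. It uses the character-class form `[Ss]`, which is shorthand for `(S|s)`, and adds one correctly spelled string (not from the list above) to confirm it is not flagged.

```python
import re

# [Ss] is shorthand for (S|s): exactly one character, in either case.
pattern = re.compile(r"[Ss]earch [Ee]ngine [Pp](rofessionals|eoples)")

candidates = [
    "search engine peoples",
    "Search Engine Peoples",
    "Search Engine Professionals",
    "Search Engine People",  # correct spelling, should not be flagged
]

# Keep only the strings in which the pattern is found somewhere.
flagged = [text for text in candidates if pattern.search(text)]
print(flagged)
```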





    Code Search Tutorial

    In this demonstration, we’ll configure A1 Website Analyzer to search for two types of Google Analytics code throughout all pages it crawls.


    We first select the presets “ga_old” and “ga_new”:


    a1wa-presets-custom-search-popup



    When you select them in the presets popup, they are automatically added to the dropdown list:


    a1wa-presets-custom-search-dropdown


    After we run the scan and inspect the results, we make sure to enable the column that shows custom search results.


    a1wa-data-column-custom-search


    This column will contain the results. Examples of how to read them:



    • Old and new analytics code found in the page:
      ga_old=1;ga_new=1
    • Old analytics code found once in the page:
      ga_old=1
    • Old analytics code found twice in the page:
      ga_old=2
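If you copy these result values out of the tool for further processing, they are simple to take apart. The sketch below assumes the format is exactly what the three examples above suggest (`name=count` pairs separated by `;`); the function name `parse_search_result` is hypothetical, and the tool's actual export format is not confirmed here.

```python
def parse_search_result(cell):
    """Turn a value such as "ga_old=1;ga_new=1" into {"ga_old": 1, "ga_new": 1}."""
    counts = {}
    for part in cell.split(";"):
        part = part.strip()
        if not part:
            continue  # empty cell or trailing separator
        name, _, count = part.partition("=")
        counts[name] = int(count)
    return counts

for cell in ["ga_old=1;ga_new=1", "ga_old=1", "ga_old=2", ""]:
    print(repr(cell), "->", parse_search_result(cell))
```

A page whose value is empty found neither snippet, which is how the "which pages are missing Google Analytics" question gets answered.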

    Taking It Further

    Now is the time to insert your own regular expression search strings. As the presets show, the format A1 Website Analyzer expects is:


    “name=expression”


    This is because, besides the regular expression itself, A1 Website Analyzer also needs a “name” it can use when showing the site search results.
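To make the relationship between the “name=expression” format and the results column concrete, here is a rough Python sketch of the idea. It is not the tool's implementation: it assumes an entry is split at the first “=” only (so the expression itself may contain “=”), the function names are hypothetical, and the two expressions are illustrative guesses based on the script file names of the classic and Universal Analytics snippets, not the actual `ga_old` and `ga_new` presets.

```python
import re

def split_search(entry):
    """Split a "name=expression" entry at the first "=" only."""
    name, sep, expression = entry.partition("=")
    if not sep or not name or not expression:
        raise ValueError(f"expected name=expression, got {entry!r}")
    return name, expression

def run_searches(entries, page_source):
    """Count matches per named search, formatted like the results column."""
    parts = []
    for entry in entries:
        name, expression = split_search(entry)
        hits = sum(1 for _ in re.finditer(expression, page_source))
        if hits:
            parts.append(f"{name}={hits}")
    return ";".join(parts)

# Illustrative expressions only; the real presets are not shown in this article.
entries = [
    r"ga_old=google-analytics\.com/ga\.js",
    r"ga_new=google-analytics\.com/analytics\.js",
]
page = (
    '<script src="//www.google-analytics.com/ga.js"></script>'
    '<script src="//www.google-analytics.com/analytics.js"></script>'
)
print(run_searches(entries, page))
```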


    When you have written your new regular expression, e.g.


    SEPMISSPELL=(S|s)earch (E|e)ngine (P|p)(rofessionals|eoples)


    you can add it using the [+] button:


    a1wa-presets-custom-search-add


    Example Searches

    Some useful examples of searches you can add with the [+] button:


    Google Tag Manager Code


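    To check whether Google Tag Manager is used in a page:


    gt=<iframe src="(https?:)?//www\.googletagmanager\.com/


    (Note: the scheme is optional here so that “http://”, “https://”, and protocol-relative “//” embeds all match.)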


    Nofollow Present In Code


    To check whether “nofollow” is used in any links on a page:


    anf=<a [^>]*?rel="?nofollow"?


    (Note: A1 Website Analyzer already has functionality to show links found on a page – this includes information such as “nofollow”)


    Frame Tag Used In Code


    To check whether “frame” or “iframe” tags are used in a page:


    fra=<(iframe|frame)(\s|>)
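Before adding searches like these to a crawl, you can sanity-check them against a snippet of page source. The sketch below does so in Python under a few stated assumptions: the patterns are typed with straight ASCII quotes and a backslash before the `s`, the Tag Manager pattern allows an optional scheme (an adjustment made for this sketch), the sample HTML is made up, and Python's `re` module stands in for the crawler's engine.

```python
import re

# Named searches, typed with straight quotes and "\s" for whitespace.
searches = {
    "gt": r'<iframe src="(https?:)?//www\.googletagmanager\.com/',
    "anf": r'<a [^>]*?rel="?nofollow"?',
    "fra": r"<(iframe|frame)(\s|>)",
}

# Made-up page source: one nofollow link and one Tag Manager noscript iframe.
sample = (
    '<a href="/partner" rel="nofollow">Partner</a>\n'
    '<noscript><iframe src="//www.googletagmanager.com/ns.html?id=GTM-XXXX" '
    'height="0" width="0"></iframe></noscript>\n'
)

# Count how many times each pattern occurs in the sample.
counts = {
    name: sum(1 for _ in re.finditer(expression, sample))
    for name, expression in searches.items()
}
print(counts)
```

Note that the closing `</iframe>` tag is not counted by the frame search, because the pattern requires the tag name to follow the `<` directly.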


    Having learned the above, you are now ready to crawl websites and run site-wide custom searches for just about anything!






    * Includes images from CyberHades, Pleuntje




    About the Author:





    My paid passion at Search Engine People sees me applying my passions and knowledge to a wide array of problems, ones I usually experience as challenges. People who know me know I love coffee.

    Ruud Hein


    The post How To Search In Code Site-Wide For Any Text appeared first on Search Engine People Blog.

