Linux Shell Script to validate canonical tag
This simple Linux shell script shows how to check a number of .html files and validate some of their contents. Use your favorite editor to create a file named 'check_canonical':
user@server261583:/var/www/html# pico check_canonical
Remember to give it executable permissions:
user@server261583:/usr/lib/s# chmod a+x check_canonical
Now, let's go through the example script step-by-step.
1. This is a bash script (note the #!/bin/bash line). The C-style for loop used here is a bash feature and is not available in plain sh (#!/bin/sh).
2. NUMFILES gets the number of .html files that contain the <link rel="canonical" tag. wc -l counts the matching lines, which equals the number of files as long as each file contains the tag only once. The backslash '\' escapes the double quotes inside the pattern.
3. Next, the tmpoutput.txt file gets the names of all the files that matched the tag.
4. The for loop goes through all the files, but it skips index.html, as its canonical URL might be written as '/'.
5. The script checks whether the file's own name appears in its canonical tag. If it doesn't, it prints an error.
6. Finally, we delete the temporary file 'tmpoutput.txt'.
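#!/bin/bash
# Count the lines that contain a canonical tag (one per file if each file has one tag).
NUMFILES=$(grep -H "<link rel=\"canonical\"" *.html | wc -l)
# Store the names of the matching files, one per line.
# -H prints the file name even when only one .html file exists.
grep -H "<link rel=\"canonical\"" *.html | cut -d':' -f1 > tmpoutput.txt
for (( i = 1; i <= NUMFILES; i++ ))
do
    # Read the i-th file name from the list.
    FILE=$(sed -n "${i}p" tmpoutput.txt)
    # index.html may use '/' as its canonical URL, so skip it.
    if [ "$FILE" == "index.html" ]; then
        continue
    fi
    # The canonical tag line must contain the file's own name.
    if grep "<link rel=\"canonical\"" "$FILE" | grep -q "$FILE"; then
        echo "File: $FILE OK!"
    else
        echo "ERROR: File: $FILE has an incorrect canonical tag!"
    fi
done
rm tmpoutput.txt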
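The same check can probably be done without the temporary file and the line counter. The following is only a sketch of that idea, not a replacement for the script above: the function name check_canonical_dir and its optional directory argument are made up for this example. It loops over the .html files directly, skips files without a canonical tag, and returns a non-zero status if any file fails the check.

```shell
# Hypothetical helper (not part of the original script): checks every .html
# file in the given directory (default: current directory).
check_canonical_dir() {
    local dir="${1:-.}"
    local file name
    local status=0
    for file in "$dir"/*.html; do
        # If no .html files exist, the glob stays unexpanded; skip it.
        [ -f "$file" ] || continue
        name=$(basename "$file")
        # index.html is skipped because its canonical URL may be written as '/'.
        [ "$name" = "index.html" ] && continue
        # Files without a canonical tag are ignored, as in the script above.
        grep -q '<link rel="canonical"' "$file" || continue
        # -F treats the file name as a fixed string, so '.' is not a wildcard.
        if grep '<link rel="canonical"' "$file" | grep -qF -- "$name"; then
            echo "File: $name OK!"
        else
            echo "ERROR: File: $name has an incorrect canonical tag!"
            status=1
        fi
    done
    return "$status"
}
```

To try it, paste the function into a script (or source it) and call it with a directory, for example: check_canonical_dir /var/www/html. Because the pattern is in single quotes, the double quotes inside it need no backslashes.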