The READIN Family Album
(April 19, 2002)

READIN

Jeremy's journal

One must write in a tongue which is not one's mother tongue

Vicente Huidobro



🦋 Filtering out page reads

You get a lot of stuff in your web server log file that has nothing to do with actual human reads of your site. I wrote a script that I think shows all the human page views in an Apache log file. It relies on the fact that browsers fetch CSS stylesheets, while robots generally don't. (It will miss humans using Lynx, which does not fetch stylesheets; the script could easily be tweaked to catch most of them. Also, I have seen Yahoo getting css files; you can fix that by putting "Slurp" in the list of patterns you're not interested in.)

grep "blog.css" "$logfile" |      # get all reads of blog.css
        awk '{print $1}' |        # extract ip address
        sort | uniq |             # only show each ip once
        grep -F -f - "$logfile" | # now pass that list of ip's back to grep
                                  # (-F: treat each ip as a literal string)
        grep " 200 " |            # only show successful reads
        grep -E -v "$exclude"     # $exclude: a regex for any files you're
                                  # not interested in

I believe you could also use "favicon.ico" instead of your css file, but this is less reliable: I don't know how often browsers request the favicon for sites they have already visited. Or you could use the filename of a graphic included on one of your pages and hosted on your site; I think this would work reasonably well.
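Since the marker file is the only thing that changes between these variations, the pipeline can be wrapped in a small function that takes the marker as an argument. What follows is only a sketch: the name human_reads and its argument order are made up for illustration, and it swaps the second grep for an awk filter that compares the first field of each log line against the collected addresses exactly, so that 1.2.3.4 does not also pull in 11.2.3.4. It assumes the client address is the first whitespace-separated field, as in the usual Apache log formats.

```shell
#!/bin/sh
# human_reads MARKER LOGFILE [EXCLUDE_REGEX]
# Hypothetical helper (not from the original script): prints the successful
# requests made by every ip address that fetched MARKER at least once.
human_reads() {
    marker=$1              # e.g. blog.css, favicon.ico, or an image filename
    logfile=$2
    exclude=${3:-'^$'}     # default only matches empty lines, so nothing real is dropped

    grep -F -- "$marker" "$logfile" |    # all reads of the marker file
        awk '{print $1}' |               # extract ip address
        sort -u |                        # only keep each ip once
        # First input (stdin) is the ip list; second input is the log itself.
        awk 'NR == FNR { seen[$1] = 1; next } ($1 in seen)' - "$logfile" |
        grep ' 200 ' |                   # only show successful reads
        grep -E -v -- "$exclude"         # drop requests you are not interested in
}
```

With a log file called access.log (a placeholder name), `human_reads favicon.ico access.log 'Slurp|\.css'` would run the favicon variant while leaving out Yahoo's crawler and the stylesheet requests themselves.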

posted evening of Tuesday, November 20th, 2007