If it hadn’t been for my psycho dentist and an operation to fix the aftermath of a tooth extraction I would have made this post sooner. I would have used St Patrick’s Day as an excuse but I don’t take holidays and I know I’m not alone on that one.
On the subject of checking for broken links and deleting bad web pages I just want to give short coverage on putting your 404 Not Found page to good use.
Too many people forget that a 404 page is just as important as any other page on a website. Why is it important? Simply because visitors still hit your 404 page, so it’s better to serve them something useful rather than letting them hit a brick wall on your site.

The above screen shot is a typical example of a piss poor 404 Error page on an MS IIS Server. Believe it or not, this screen shot is taken from a so-called expert on-line marketing company based here in Ireland. Tut tut, Continuum.
Keep It Simple
If you search online for 404 Error pages (or something similar) you’ll find lots of advice on the best methods to use in serving a 404 error page. Probably the best advice is to keep it simple.
Tell the visitor the page is no longer available, give them the option to contact you but also direct them to either your homepage or sitemap and even have a search box for them to use. That seems to be the most popular advice that can be found on the web.
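To make that advice concrete, here is a rough sketch in Python of a simple 404 page being served with a real 404 status. Everything in it is an assumption made for illustration: the `/sitemap/`, `/contact/` and `/search/` addresses, the wording of the page, and the toy handler that answers every request as if the page were missing. Your own server or blog software will have its own way of setting a custom error page.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_not_found_page(requested_path):
    """Return a simple HTML body for a 404 page (all link targets are placeholders)."""
    safe_path = escape(requested_path)  # never echo the raw URL back into the page
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>Page not found</title></head><body>\n"
        "<h1>Sorry, that page is no longer available</h1>\n"
        f"<p>Nothing was found at <code>{safe_path}</code>.</p>\n"
        "<ul>\n"
        '<li><a href="/">Homepage</a></li>\n'
        '<li><a href="/sitemap/">Sitemap</a></li>\n'
        '<li><a href="/contact/">Contact me</a></li>\n'
        "</ul>\n"
        '<form action="/search/" method="get">\n'
        '<input type="text" name="q"> <input type="submit" value="Search">\n'
        "</form>\n"
        "</body></html>\n"
    )


class NotFoundHandler(BaseHTTPRequestHandler):
    """Toy handler: every request gets the friendly page with a real 404 status."""

    def do_GET(self):
        body = build_not_found_page(self.path).encode("utf-8")
        self.send_response(404)  # the status line matters as much as the page itself
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet
```

To try it locally you could run `HTTPServer(("127.0.0.1", 8000), NotFoundHandler).serve_forever()` and request any address on port 8000. The point of the sketch is only that the helpful page and the 404 status travel together.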
A 404 Error page can also be used to earn extra revenue but I’ll come back to that another day.
March 20th, 2008
If you haven’t figured it out yet, this is a series of posts continuing from where I left off back in November, after deleting over 140 blog posts.
With that being said, how do you delete your web pages or blog posts?
I’m a firm believer that removing web pages isn’t as simple as deleting them from the web server and walking away. A few minor adjustments need to be made so that everyone, bots included, knows exactly what has happened to the content.
One way of letting the search bots know what we’ve done with certain content is by making use of HTTP Status Codes such as:
- 200 OK
- 301 Moved Permanently
- 302 Found
- 304 Not Modified
- 307 Temporary Redirect
- 400 Bad Request
- 401 Unauthorized
- 403 Forbidden
- 404 Not Found
- 410 Gone
- 500 Internal Server Error
- 501 Not Implemented
People use all sorts of whacky configurations when a web page is deleted from the server but the correct status code to serve would be the 410 Gone error. This is debatable and not widely practised (more on this later).
Never use 301 Moved Permanently On Deleted Pages Or Error Pages
It’s very common practice out there for people to redirect their error pages back to their home page. Bad idea, and here’s why:
- It’s a misuse of the HTTP status codes
- Telling bots content has moved when it actually hasn’t (confusing them?)
- Confused bots can do horrible things
- It could confuse people on why the page they requested has redirected back to the home page
OK, so some exaggeration there, but hopefully you get my point on why using a 301 redirect on an error page could be a bad idea.
Make Sure Not To Serve 200 OK Status Code For Deleted Content
As the status code says, 200 OK. This is the code for content being found on the web server. One common mistake is failing to configure error pages correctly, so that when a deleted page is requested and the error page is served, it comes back with a 200 OK. This confirms the content is still there when it’s not. One way to check this is to use a tool such as HTTP Status Codes Checker and see what status your error page returns when you enter a false URL on your domain. There is plenty of information on-line regarding this issue; I particularly like this GSiteCrawler article.
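If you would rather run the check yourself, the sketch below does roughly the same thing in Python: request an address that should not exist and look at the raw status code that comes back, without following any redirect. The function names `fetch_status` and `judge_error_page` and the verdict wording are my own inventions for this example, not part of any tool mentioned here.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return the raw HTTP status code for url, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def judge_error_page(status):
    """Give a plain-language verdict on the status returned for a made-up URL."""
    if status in (404, 410):
        return "ok: the error page reports the content as missing"
    if status == 200:
        return "problem: the error page claims the content still exists"
    if status in (301, 302, 307):
        return "problem: the missing URL redirects instead of reporting an error"
    return "check manually: unexpected status"
```

A call such as `judge_error_page(fetch_status("http://www.example.com/this-page-does-not-exist"))` (with your own domain in place of the placeholder) should come back starting with "ok" if the error page is configured properly.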
The 410 Gone Status Code
The requested resource is no longer available at the server and no forwarding address is known.
The 404 Not Found Status Code
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent.
As you clearly see the above explains the exact difference between a 404 error & a 410, so why are most people using a 404 status code on a URL that has been deleted forever? Ah Shite, why am I even doing it?
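In code terms the difference comes down to whether the server knows a page was removed on purpose. Here is a minimal Python sketch of that decision, assuming you keep a list of the addresses you have deleted for good. The function name and the example paths are made up for illustration.

```python
from http import HTTPStatus


def status_for_path(path, live_paths, deleted_paths):
    """Pick the status code for a request, given known live and deleted paths."""
    if path in live_paths:
        return HTTPStatus.OK          # 200: the content is here
    if path in deleted_paths:
        return HTTPStatus.GONE        # 410: removed on purpose, no forwarding address
    return HTTPStatus.NOT_FOUND       # 404: nothing matches, no promise either way


# Hypothetical paths, purely for illustration
live = {"/", "/about/"}
deleted = {"/2007/11/some-old-post/"}
```

With those sets, `status_for_path("/2007/11/some-old-post/", live, deleted)` gives 410, while a mistyped address that was never a page falls through to 404.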
Google Engineer Matt Cutts states:
So many webmasters misuse 404 vs. 410 that I don’t expect we’ll distinguish between them any time soon.
Right now I’m staring at 16 URLs serving the 404 (Not Found) status code. These are the remainders of the blog posts that I deleted late last year. The most recent date Google requested these URLs was Feb 21, 2008, so they are still coming back looking for them, exactly as they should considering I haven’t indicated whether it was a temporary or permanent deletion.
So will a 410 status code stop Google from requesting those URLs? I know I can use Google Webmaster Tools to remove URLs from Google’s database, but I’m more interested in the 410 Gone method for now. Judging from Matt’s comment and other comments, though, the two codes are not going to be treated differently.
Mark Pilgrim gives a brilliant guide and interesting discussion on HTTP Error 410: Gone, even if it is a 5 year old blog post.
I’ve been looking at a few sites to see how people use the status codes and what information they provide for non-existent URLs. It’s interesting to see how little attention people pay to their own error pages, which will lead me on to a follow-up post on putting your error pages to good use, more specifically a 404 Not Found page. Take a look at mine, it’s very boring.
I know this is a very old topic, but it’s something that has been sparking my interest since I deleted my previous blog.
February 27th, 2008
How often do you check for broken links within your website? If you don’t check for them then I think you are mad and why would I think that?
Most, if not all, sites/blogs require maintenance checks and this would include checking for broken links. Some people check them at different intervals but if my site or blog is constantly changing I prefer to run spot checks quite regularly, at least once or twice a month.
You can use a free tool such as Link Sleuth that can quickly scan web pages and come back with a status code report on the links. You can find a list of status code definitions that may appear on the report here.
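For the curious, here is a rough Python sketch of the basic idea behind a link checker: pull the links out of a page, request each one, and report anything that fails or comes back with a 4xx or 5xx status. This is only an illustration of the technique, not how Link Sleuth itself works, and every name in it (`extract_links`, `get_status`, `find_broken_links`) is hypothetical.

```python
import http.client
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkExtractor(HTMLParser):
    """Collect absolute http(s) URLs from the <a href="..."> tags in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                absolute = urljoin(self.base_url, value)
                # skip mailto:, javascript: and other non-web links
                if urlsplit(absolute).scheme in ("http", "https"):
                    self.links.append(absolute)


def extract_links(html, base_url):
    """Return the web links found in html, resolved against base_url."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def get_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def find_broken_links(links, status_of=get_status):
    """Return (link, status) pairs for links that fail or return 4xx/5xx."""
    report = []
    for link in links:
        status = status_of(link)
        if status is None or status >= 400:
            report.append((link, status))
    return report
```

Redirects (301, 302 and so on) are not counted as broken in this sketch. A real tool would also crawl from page to page across the whole site, which is left out to keep things short.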
Why bother checking for broken links?
When (not if) a broken link appears on your site or blog it affects three things: your visitors, the page/site you are linking to, and the search engine bots.
The Visitor
We’ve all been there before. We run a search, find related content, read it, and click a link within the page, only to find a dead URL that leaves us in a state of disappointment. It’s actually a personal hate of mine, clicking on a broken link that is, and I’m sure many other internet users hate it too. Having broken links on our sites isn’t visitor friendly, so this is the most important reason why we should be checking for them regularly.
The Page Or Site Being Linked
Links are the glue of the internet; without them most pages wouldn’t be found at all. If we have broken links on our sites we are not helping other pages & sites be more easily found, so again, it’s important to check.
The Search Bots
When search engine bots crawl the web they use links to hop from one page to another or from one site to another; as I said, links are on-line glue. It’s not search engine friendly when links are broken and the bots are unable to continue along the path you’ve laid out on your site. Having broken links on a page may or may not (assuming here) affect the search engine rankings for that page, but they would certainly lower its quality in the eyes of both visitors (again) and search engines.
If you haven’t done so already download Link Sleuth and run it. It’s very simple to use and yet very effective for finding broken links. The bigger your site or blog is the longer the crawl will take, be patient.
February 25th, 2008