Nicholas Chimonas, reporting for duty. Today’s news? Freedom. No limits. SEO. Spider. Crawler. Tool!
I performed a Google search for [spider robots doom], and that is what I ended up with. Thanks, robotmonkeys.net.
Until a gigantic robotic spider brings about my demise, *coughBostonDynamicscough* I’ll be enjoying Beam Us Up’s much less evil free SEO crawler.
A rare sighting of BUU in the wild:
283 characters on the meta description, eh? This is why we can’t have nice things, Star Wars. “Smuggler. Scoundrel. Hero. Han Solo, captain of the Millennium Falcon.” That’s better. I’m sending an invoice to LucasArts.
A lot of us have played around with Screaming Frog, Xenu, and Moz. The free versions of all of those tools have crawl limits, as you might expect. Such is the price of free.
However, Beam Us Up has no patience or pity for the restrictions of free. Myth has become reality. The first truly free non-restrictive SEO crawler has come to wreak havoc on the galaxy. It’s your PC versus Beam Us Up, and I know who comes out on top of that one: everybody! Everybody who owns a decent computer anyways.
Seriously though, “crawl limit” is purely dictated by the power of your computer. I’ve crawled thousands of pages with this beast. I even have a medal from a thrift store to prove it. “Chimonas - Champion Crawler” is scrawled in sharpie across a little piece of masking tape and everything. You don’t become a champion with restrictive SEO crawlers.
Anyways, the best part about this tool is how it reports SEO fails and aggregates offending pages into categories to be systematically addressed. Here are a few of the most useful ways that BUU has your back:
- WARNING: Meta.Description.Duplicate (#)
- ERROR: Title.Duplicate (#)
- WARNING: Title.Too_Long (#)
- WARNING: Title.Too_Short (#)
- TO_NOTE: Meta.Description.Too_Long (#)
- WARNING: Page.Iframe (#)
- TO_NOTE: Status.301 (#)
- TO_NOTE: Page.Rel_Canonical_Same (#)
- TO_NOTE: Meta.Description.Missing (#)
The (#) displays exactly how many URLs are busting your chops. Click the warning message and only the offending URLs will be displayed. At that point you can export the data to Excel or even directly to Google Drive.
Gah! Who let Andy Warhol use the SEO crawler again?
For a full list of error filter descriptions, check out the official documentation.
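If you're curious what a check like Title.Duplicate or Title.Too_Short boils down to under the hood, it's just an aggregation over crawl data. Here's a minimal sketch in Python — the page data, thresholds, and label formatting are my own stand-ins, not BUU's actual rules:

```python
from collections import Counter

# Hypothetical crawl results: URL -> page title (stand-ins, not real BUU output)
pages = {
    "/": "Home | Example Store",
    "/shoes": "Shoes | Example Store",
    "/boots": "Shoes | Example Store",  # duplicate title
    "/sale": "S",                        # too short
}

TITLE_MIN, TITLE_MAX = 10, 60  # assumed thresholds, not BUU's real ones

title_counts = Counter(pages.values())
duplicate_titles = [u for u, t in pages.items() if title_counts[t] > 1]
too_short = [u for u, t in pages.items() if len(t) < TITLE_MIN]
too_long = [u for u, t in pages.items() if len(t) > TITLE_MAX]

print(f"ERROR: Title.Duplicate ({len(duplicate_titles)})")
print(f"WARNING: Title.Too_Short ({len(too_short)})")
```

Every filter in the list above is some variation on this pattern: scan all crawled pages, bucket the offenders, report the count.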
BUU supplies you with the same URL level information that the top players in the game provide. These are the columns of data you will receive:
- Status Code
- Title Length
- Meta Description
- Description Length
- Meta Keywords
- Keywords Length
- H1 Length
- H2 Length
- H3 Length
- Meta Robots
- Canonical URL
- Redirect Location
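Each of those columns is just a field scraped out of the page's HTML. As an illustration (this is my own toy parser, not how BUU actually does it), here's how a few of them can be pulled with nothing but Python's standard library:

```python
from html.parser import HTMLParser

class SEOFieldParser(HTMLParser):
    """Extracts a few of the columns above from raw HTML (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.fields["meta_description"] = attrs.get("content", "")
        elif tag == "link" and attrs.get("rel") == "canonical":
            self.fields["canonical_url"] = attrs.get("href", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.fields["title"] = data.strip()

# A made-up page, just to exercise the parser
html = """<html><head><title>Example Page</title>
<meta name="description" content="A demo page.">
<link rel="canonical" href="https://example.com/page">
</head><body></body></html>"""

parser = SEOFieldParser()
parser.feed(html)
parser.fields["title_length"] = len(parser.fields["title"])
```

The length columns (Title Length, Description Length, and so on) fall straight out of `len()` on the extracted strings.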
That data is spicier than my favorite nectar, Theobroma by Dogfish Head. Remember, it’s very dangerous to perform an SEO audit without any beer. You could become bored, confused, frustrated… Audit at your own risk. So crack a cold one, and allow me to dump my brain knowledge into your neurons and stuff. Seeing as how I’m a man of links, I’ll drop you a tip on how you might utilize this crawler to recover lost link equity.
If you like coffee and cute robots, you may have seen a video or two from the Moz folks. Specifically this one by Cyrus Shepard, which covers how to use Moz’s suite with Open Site Explorer to recover lost link equity from pages which have been 404’d. If you’ve got the Moz suite, sweet. However, if you want to do things the free way, read on. Quick note: You will need a paid subscription to Majestic in order to use their API with SEOTools. Otherwise, a free backlink data gathering option is listed as an alternative.
The base concept is to identify pages that earned links but were later removed, resulting in a loss of link equity since those links now point to a 404 page. These are links you’ve already won by merit, but lost through... er... mistakes. Follow the seven steps below and you'll be rich with the recovery of long-lost links.
Step One: Download BUU from Beam Us Up
This is the first thing you’ll see once opening the link to BUU’s site. Make sure you click “download”, or else you won’t have BUU on your computer. Following the rest of this tutorial will be all for naught if you do not properly execute this first step. You’ve been warned.
Step Two: Crawl your site
No installation is required, unless you don't have Java, in which case you'll need to install Java first. Presuming you do have Java, once the .exe is downloaded, simply run the file, input the domain you'd like to crawl, and click start.
Step Three: Export in the format of your choosing
I choose XLS, and you should too, although you could export directly to Google Drive, which is a pretty handy feature if you're going to be tag-teaming this spreadsheet with a friend. Make sure you’ve clicked “Show all URLs Found” before you click export.
Step Four: Navigate to the “status.error” tab
You’ll see all of the pages throwing a 404 status code. Beam Us Up has already filtered them out for us in the export. So. Cool.
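If your export lands somewhere other than Excel, the same filtering is a one-liner anywhere you can read CSV. A sketch with made-up rows (real BUU exports carry many more columns):

```python
import csv
from io import StringIO

# A tiny stand-in for a crawl export (hypothetical URLs and columns)
export = """URL,Status Code
https://example.com/,200
https://example.com/old-post,404
https://example.com/shop,301
https://example.com/dead-page,404
"""

rows = csv.DictReader(StringIO(export))
not_found = [r["URL"] for r in rows if r["Status Code"] == "404"]
```

That `not_found` list is exactly what the "status.error" tab hands you for free.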
Step Five: Download referring domain linking data
With a paid subscription to Majestic, you can utilize an API to acquire the data for the URL list of 404 pages directly through Excel, via the SEOTools plugin. Otherwise, use your backlink tool of choice and manually check each 404 page for backlinks. WebMeUp is a free backlink tool if you want to keep the freedom theme going.
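Whichever backlink tool you use, you'll typically end up with pairs of (linking page, 404 target). Counting unique referring root domains per target is then simple. A hedged sketch — the URLs are invented, and the root-domain logic is deliberately naive:

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical backlink export: (linking page, 404 target it points at)
backlinks = [
    ("https://blog.foo.com/post", "https://example.com/old-post"),
    ("https://foo.com/resources", "https://example.com/old-post"),
    ("https://bar.org/links", "https://example.com/old-post"),
    ("https://bar.org/other", "https://example.com/dead-page"),
]

def root_domain(url):
    # Naive two-label root: fine for .com/.org, wrong for e.g. .co.uk
    return ".".join(urlparse(url).hostname.split(".")[-2:])

ref_domains = defaultdict(set)
for source, target in backlinks:
    ref_domains[target].add(root_domain(source))

lrd_counts = {target: len(domains) for target, domains in ref_domains.items()}
```

Note that `blog.foo.com` and `foo.com` collapse to one root domain, which is why sets (not raw link counts) give you the LRD number.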
SEOTools, Side Walkthrough
If you haven’t performed this level of advanced black magic before, you may want to read Distilled's Guide to Excel for SEO. P.S.: If you are running Excel 2013, you’ll need the newest beta version of SEOTools.
Onward to the wizardry:
- Download and unzip SEOTools, and then open the XLL Add-In named “SeoTools”
- Log in to Majestic through Excel: First click on the “SeoTools” tab at the top of your sheet, and then click MajesticSEO. This video walkthrough can help if you get lost.
- Once you’re logged in, copy all of the URLs from column A and paste them into the “Url(s)” box.
- In the “Fields” box, uncheck everything except “RefDomains”
- Click on the cell in row 1 of column D; this is where your referring domain data will be inserted
- Click “Insert”
Step Six: Sort 404 pages by number of linking root domains
The 404 pages with the highest number of LRDs are the ones you should work on first. Highlight all of column D, click the “DATA” tab at the top of your sheet, and just under that click the button “ZA--->” highlighted in the image below to sort from largest to smallest.
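The Excel ZA sort translates directly if you're working outside a spreadsheet. A quick sketch (the counts are made up):

```python
# Hypothetical LRD counts per 404 URL, from the previous step
lrd_counts = {
    "https://example.com/old-post": 14,
    "https://example.com/dead-page": 2,
    "https://example.com/moved": 7,
}

# Equivalent of Excel's "ZA" sort: highest linking-root-domain count first
priority = sorted(lrd_counts.items(), key=lambda kv: kv[1], reverse=True)
```

The top of `priority` is where your recovery effort pays off fastest.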
In Linkarati’s example, all of the 404 pages only have 2 domains linking to them, so their priority is equal. That's because our site is super awesome and cared for with love. However, I have performed this exact technique on real-life clients too. Some have had hundreds of 404 pages and thousands of lost links. Recovering lost link equity can be a real gold mine.
Step Seven: Recover your lost link equity!
You now have 3 options:
- 301 the 404 page to a relevant, up to date page
- Recreate the page the links are pointing to
- Outreach to the webmasters of the linking root domains pointing links to your broken 404 page, with a request to change the destination URL
Once you’ve completed any of those 3 steps for each 404 page, the only thing left to do is...
“The Communist Party”, by Tom Burns - Threadless