John Lincoln

How To Use Siteliner To Quickly Track Duplicate Content

Would you like to quickly check for duplicate content on your site? If so, have a look at Siteliner.

Siteliner is a web crawler. Although it’s billed as a tool that checks for duplicate content, it will also find other issues on your website, such as broken links.

Bonus: the tool is brought to you by the same folks who created Copyscape. So it’s pretty good at tracking duplicate content.

In this guide, I’ll go over Siteliner so you can decide if it’s best for your business.

It’s Web-Based

Siteliner runs in the cloud. That means you don’t have to download anything to get it up and running.

Other web crawlers, like Screaming Frog, require a software download and install.

All you need to do to get the ball rolling is head over to the Siteliner website and plug your home page URL into the field in the middle of the screen. Then, click Go.

Siteliner will crawl all over your site looking for duplicate content. It will take a few minutes, depending on the number of pages.

Keep in mind: if you’re using the free version of Siteliner, you’re limited to a maximum of 250 pages. If you want to crawl more than that, you’ll have to upgrade to the premium service (read: $$).

Also, you can only crawl a single domain once per month with the free version. With the premium service, you can crawl as often as you want.

The Report

When Siteliner is finished crawling your site, you’ll get a report with the following info:

  • Page Data – Key information about each page, such as its URL, title, size, number of words, and date modified.
  • Duplicate Content – Content that matches the content of another URL on your site.
  • Broken Links – Links to a page on your site that has since moved or is no longer accessible.
  • Skipped Pages – Pages that the tool skipped when it crawled your site.
  • Related Domains – Subdomains on the site you scanned.

There’s a lot to unpack in each part of the report, so let’s break them all down in detail.

Page Data

Page Data not only shows you key info about the pages that Siteliner scanned but also gives you a metric for each page called Page Power.

Page Power shows you how prominently the page might appear to search engines that crawl your site.

Why would some pages appear more prominent than others? Because they have more links pointing to them than other pages on the site.

In other words, Siteliner looks at internal linking.

You can use the Page Power data to spot low-power pages that aren’t relevant any longer and remove them. Alternatively, you might want to update them with more timely info.

The Page Power metric can also give you actionable insight on good internal linking strategies. Remember, you should link to the pages that are most important to your brand.
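
To make the internal-linking idea concrete, here’s a minimal Python sketch that counts how many pages on a site link to each page. Siteliner doesn’t spell out how it calculates Page Power, so treat this as a rough stand-in rather than the tool’s actual formula. The `site_links` dictionary and its URLs are made up for illustration.

```python
from collections import Counter


def count_inbound_links(site_links):
    """Return a Counter mapping each page to the number of other pages linking to it."""
    inbound = Counter()
    for source, targets in site_links.items():
        # Repeated links from the same page count once; self-links are ignored.
        for target in set(targets):
            if target != source:
                inbound[target] += 1
    return inbound


# Hypothetical site: each key is a page, each value lists the pages it links to.
site_links = {
    "/": ["/services", "/blog", "/contact"],
    "/blog": ["/services", "/blog/post-1"],
    "/blog/post-1": ["/services", "/"],
    "/services": ["/contact"],
    "/contact": [],
}

inbound = count_inbound_links(site_links)
for page, link_count in inbound.most_common():
    print(page, link_count)
```

In this made-up site, `/services` collects the most inbound links, so it would be the page the internal linking favors most.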

Duplicate Content

This is where Siteliner highlights identical content found on two or more pages on your site.
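
If you’re curious how a duplicate content check can work in principle, here’s a small Python sketch that compares two pieces of text by their overlapping three-word groups. This is a common textbook approach (word shingles plus Jaccard similarity), not Siteliner’s own method, which isn’t described here. The sample page text is invented.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word groups (shingles) in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 (no overlap) to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented page text for illustration.
page_a = "Our team offers SEO audits for small businesses"
page_b = "Our team offers SEO audits for large enterprises"

print(similarity(page_a, page_a))  # prints 1.0
print(similarity(page_a, page_b))  # prints 0.5
```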

Note: Siteliner treats a URL that ends with a slash (“/”) as different from the exact same URL that doesn’t end with a slash.

So the report might show you that you have 100% duplicate content on two URLs but they’re really the same URL. One ends with a slash and the other one doesn’t.

The solution is to add a rel=canonical link element to the pages, pointing at the version of the URL you prefer. Siteliner reads that tag and won’t flag the content as duplicate.
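
Here’s a short Python sketch of that fix. It checks whether two URLs differ only by a trailing slash and builds the rel=canonical tag you’d place in the `<head>` of the page. The `example.com` URLs and the function names are placeholders, and which version you treat as canonical is your call.

```python
import html
from urllib.parse import urlsplit, urlunsplit


def strip_trailing_slash(url):
    """Return the URL with trailing slashes removed from its path (the root stays "/")."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def same_page(url_a, url_b):
    """True when the two URLs differ only by a trailing slash."""
    return strip_trailing_slash(url_a) == strip_trailing_slash(url_b)


def canonical_tag(url):
    """Build the rel=canonical link element pointing at the preferred URL."""
    return f'<link rel="canonical" href="{html.escape(url, quote=True)}">'


# Placeholder URLs for illustration.
with_slash = "https://example.com/blog/"
without_slash = "https://example.com/blog"

if same_page(with_slash, without_slash):
    print(canonical_tag(without_slash))
```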

Further, the tool lists the URLs it scanned in a table. Just click on any row in the table to see the duplicate content for a specific page.

Oh, by the way: when you look at the duplicate content section of the report, you might see some of the same terminology you’ve seen when using Copyscape.

That’s for a very good reason. Remember, Siteliner is produced by the same people who made Copyscape.

Broken Links

It’s easy to make changes on your site and forget to update links accordingly. Fortunately, Siteliner will find broken links that could hurt your SEO efforts.

If the tool identifies a broken link, either create a page that redirects from that link to the right place or update the broken link on the appropriate page.
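
If you want to double-check a handful of links yourself, a sketch like the one below can help. It asks each URL for its HTTP status and reports anything that fails or answers with a 4xx or 5xx code. This is my own rough approach using Python’s standard library, not how Siteliner does it, and the URLs and status codes in the example are invented. Some servers don’t answer HEAD requests properly, so treat the results as a first pass.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a URL, or None if the request fails outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered with an error code such as 404
    except OSError:
        return None  # DNS failure, refused connection, or timeout


def find_broken_links(urls, get_status=fetch_status):
    """Return (url, status) pairs for links that fail or return a 4xx/5xx code."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken


# Invented statuses so the example runs without touching the network.
fake_statuses = {
    "https://example.com/": 200,
    "https://example.com/old-page": 404,
    "https://example.com/down": None,
}
print(find_broken_links(fake_statuses, get_status=fake_statuses.get))
```

To check live links, call `find_broken_links(your_urls)` and let it use `fetch_status` by default.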

Skipped Pages

Siteliner doesn’t scan everything. That’s for a very good reason.

For example, if you’re using the aforementioned rel=canonical on your site, the tool will only look at the target URL named in the tag.

The whole point of the rel=canonical tag, after all, is to ensure that certain pages don’t get flagged for duplicate content. Search engines also generally index the canonical URL rather than those pages.

Also, if there’s a meta refresh redirection, Siteliner will follow the redirection and skip the underlying page.

The tool also skips pages in the event of an HTTP header redirection, a frame redirection, or if it’s prohibited from crawling the page by robots.txt.
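
To see why robots.txt can cause pages to be skipped, here’s a small Python sketch that uses the standard library’s `urllib.robotparser` to test URLs against a set of rules. The robots.txt content, the URLs, and the `ExampleCrawler` user agent are all made up. I’m not claiming this is how Siteliner applies the rules internally.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart
"""


def split_by_robots(urls, robots_txt, user_agent="ExampleCrawler"):
    """Split URLs into (crawlable, skipped) lists according to the robots.txt rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    crawlable, skipped = [], []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            crawlable.append(url)
        else:
            skipped.append(url)
    return crawlable, skipped


urls = [
    "https://example.com/blog/post",
    "https://example.com/admin/login",
    "https://example.com/cart",
]
crawlable, skipped = split_by_robots(urls, ROBOTS_TXT)
print("Crawlable:", crawlable)
print("Skipped:", skipped)
```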

In any case, you’ll find a list of skipped pages on the Siteliner report.

Related Domains

If you’ve got subdomains, Siteliner will find those and identify them.

However, it won’t crawl them.

If you’d like the tool to crawl those domains, just click on the appropriate row. Siteliner will set up a new scan for that subdomain.

Keep in mind: the www prefix is technically considered a subdomain. It should produce the exact same results as the URL without the www prefix, but it’s worth your while to check.

Downloads

Yes, you can download the Siteliner report. As of now, it’s only available in CSV format.

To download the report, just click on the download link on the left-hand side of the report.

You may also choose to download only a specific table on the screen. Once again, that file is available only in CSV format.

Finally, Siteliner produces an XML Sitemap for you. You can upload that to your site and use it as your sitemap for search engines.
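
Before you upload a sitemap, it’s worth checking which URLs it lists. Here’s a minimal Python sketch that reads the URLs out of a sitemap. It assumes the file follows the standard sitemaps.org format; I haven’t confirmed the exact layout of the file Siteliner generates, and the sample XML below is invented.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Invented sample sitemap for illustration.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
</urlset>
"""


def sitemap_urls(xml_text):
    """Return the list of page URLs found in the sitemap's <loc> elements."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


for url in sitemap_urls(SAMPLE_SITEMAP):
    print(url)
```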

The Cost

So how much does Siteliner cost if you decide to go premium?

That’s an easy question to answer. Each page only costs 1 penny.

So if you have 1,000 pages on your site and you want them all scanned, that’s going to cost you $10.
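
If you’d like to estimate the bill for your own site, the math is simple enough to sketch in a few lines of Python. This assumes the flat price of 1 cent per page mentioned above and nothing else, so check Siteliner’s current pricing before you buy. The function name is just for illustration.

```python
PRICE_PER_PAGE_CENTS = 1  # flat per-page price quoted in this guide


def scan_cost_dollars(pages):
    """Return the cost in dollars of scanning the given number of pages."""
    if pages < 0:
        raise ValueError("pages must be zero or more")
    return pages * PRICE_PER_PAGE_CENTS / 100


print(f"${scan_cost_dollars(1000):,.2f}")  # prints $10.00
```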

You can buy premium credits via credit card or PayPal.

Wrapping It Up

Siteliner is a nifty little tool that will help you identify duplicate content on your site. As a bonus, it will find broken links and give you insights about pages that stand out the most.

If you haven’t yet taken Siteliner for a free test drive, why not do so today?

About the Author

John Lincoln is CEO of Ignite Visibility, one of the top digital marketing agencies in the nation. Ignite Visibility is a six-time Inc. 5000 company. Ignite Visibility offers a unique digital marketing program tied directly to ROI with a focus on using SEO, social media, paid media, CRO, email and PR to achieve results. Outside of Ignite Visibility, Lincoln is a frequent speaker and author of the books Advolution, Digital Influencer and The Forecaster Method. Lincoln is consistently named one of the top digital marketers in the industry and was the recipient of the coveted Search Engine Land “Search Marketer of The Year” award. Lincoln has taught digital marketing and Web Analytics at the University of California San Diego since 2010 and has been named one of San Diego’s most admired CEOs and a top business leader under 40. Lincoln has also made “SEO: The Movie” and “Social Media Marketing: The Movie.” His business mission is to help others through digital marketing.