<h1>Pelican Static sites - SEO Optimization</h1>
<p>By Khaled Monsoor, 2017-01-07 (https://blog.kmonsoor.com/)</p>
<p>Writing is a hard job, especially when it is a hobby alongside your day job. But what’s the benefit if nobody reads your writing just because they couldn’t find it?</p>
<p>Usually, the <code>Pelican</code> <a href="https://github.com/getpelican/pelican">static-site generator</a> is not very concerned about the <span class="caps">SEO</span> of the generated site, mainly because it’s not that focused on commercial usage. That’s what I felt.
Related themes and their templates also don’t take it very seriously. But you shouldn’t lose <span class="caps">SEO</span> just because you migrated from WordPress or whatever you were using previously, right?</p>
<p>Oftentimes, theme authors focus more on the look and feel of the theme than on <span class="caps">SEO</span> concerns.</p>
<p>That’s what this post is about; let’s fix that.</p>
<h4 id="note-1-about-other-static-site-generators">Note 1: About other static-site generators<a class="headerlink" href="#note-1-about-other-static-site-generators" title="Permanent link">¶</a></h4>
<p>Although the following discussion <span class="amp">&</span> code are mostly specific to <a href="https://github.com/getpelican/pelican">Pelican</a> templates, which use the <a href="http://jinja.pocoo.org/">jinja2</a> templating language, the concepts and concerns here apply to most static-site generators and their themes.</p>
<h4 id="note-2-im-in-no-way-an-seo-expert">Note 2: I’m, in no way, an <span class="caps">SEO</span> expert<a class="headerlink" href="#note-2-im-in-no-way-an-seo-expert" title="Permanent link">¶</a></h4>
<p>This writeup is just a collection of my findings while correcting my blog’s <span class="caps">SEO</span> course and fixing my own stupid mistakes. Also, this isn’t a commercial site. There are many more, and more in-depth, aspects of <span class="caps">SEO</span> than the following, and they can be very important for commercial projects.</p>
<h2 id="getting-started">Getting started<a class="headerlink" href="#getting-started" title="Permanent link">¶</a></h2>
<h3 id="lookup-for-missing-pieces">Look for missing pieces<a class="headerlink" href="#lookup-for-missing-pieces" title="Permanent link">¶</a></h3>
<p>Make sure all the linked resources (links, images, <span class="caps">CSS</span> <span class="amp">&</span> <span class="caps">JS</span> files) that you’ve used in your pages are valid. To do that, check the browser’s <code>console</code> in <code>Developer tools</code> for any errors, e.g. unavailable URLs, faulty HTML/CSS etc.</p>
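<p>Besides the browser console, the generated output can be checked offline. The following is a minimal sketch, not part of Pelican: the names <code>ResourceCollector</code> and <code>missing_local_resources</code> are made up for illustration, and it only checks site-local references (external URLs are skipped).</p>

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class ResourceCollector(HTMLParser):
    """Collect href/src attribute values from one HTML document."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.resources.append(value)


def missing_local_resources(html_text, page_dir, site_root):
    """Return site-local references that do not exist on disk."""
    collector = ResourceCollector()
    collector.feed(html_text)
    missing = []
    for ref in collector.resources:
        parsed = urlparse(ref)
        # Skip external URLs, mailto: links, and pure in-page anchors.
        if parsed.scheme or parsed.netloc or not parsed.path:
            continue
        # Root-relative paths resolve against the site root,
        # everything else against the folder of the current page.
        base = site_root if parsed.path.startswith("/") else page_dir
        target = Path(base) / parsed.path.lstrip("/")
        if not target.exists():
            missing.append(ref)
    return missing
```

<p>Running it over every <code>.html</code> file of the generated output folder gives a quick list of broken local links and images before publishing.</p>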
<h3 id="know-the-critical-spots">Know the critical spots<a class="headerlink" href="#know-the-critical-spots" title="Permanent link">¶</a></h3>
<ul>
<li><strong>pelicanconf.py</strong> - It’s usually in the root folder of your site.</li>
<li><strong>base.html</strong> - It’s in the <code>templates</code> folder of the theme that you are using. Changes here will impact all <span class="caps">HTML</span> pages generated by Pelican.</li>
<li><strong>article.html</strong> - It’s in the same folder as <code>base.html</code>. Changes here will impact only the article pages.</li>
</ul>
<h2 id="avoiding-duplication">Avoiding Duplication<a class="headerlink" href="#avoiding-duplication" title="Permanent link">¶</a></h2>
<h3 id="avoid-your-source-getting-indexed-by-google">Avoid your source getting indexed by Google<a class="headerlink" href="#avoid-your-source-getting-indexed-by-google" title="Permanent link">¶</a></h3>
<p>If the source of your blog is not open, meaning the content is not in a public repo, this isn’t your concern.
But if it is, it should be a concern. </p>
<p>The reason is that <code>Github.com</code> (or wherever your repo is hosted) is a stronger domain than yours, and Google “sees” that your site’s content, even though it’s yours, is also on GitHub. So, there’s a high chance that it’ll mark your content as “duplicate”, and duplicate content is penalized heavily from Google’s <span class="caps">SEO</span> point of view. When someone searches for one of your writeups, even if it’s unique on the Internet, Google will show both your site and the repo, and the links to your site may well rank lower.</p>
<p>So, how do you avoid that? It’s not complicated.</p>
<p>For example, at the time of writing, Github.com’s <a href="https://github.com/robots.txt">robots.txt</a> allows Google (or any search engine, for that matter) to index only the <code>master</code> branch of any open-source repo. So, if your repo doesn’t have a branch named <code>master</code>, it won’t be indexed. That’s it. Rename your “master” to “live”, “main”, “production” or whatever you feel like.</p>
<ol>
<li>Create a new branch to replace “master”</li>
</ol>
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos="1 "></span>$ git checkout -b new-master
<span class="linenos" data-linenos="2 "></span>$ git push -u origin new-master
</code></pre></div>
<ol start="2">
<li>Tell GitHub about the move by making the new branch the default branch in the repo settings</li>
</ol>
<p><img alt="change repo-settings on Github" src="http://i.imgur.com/wjf6zwul.png"></p>
<ol start="3">
<li>Now, delete <code>master</code> branch from your repo.</li>
</ol>
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos="1 "></span>$ git branch -d master
<span class="linenos" data-linenos="2 "></span>$ git push origin :master
</code></pre></div>
<h3 id="utilize-relcanonical-link">Utilize <code>rel="canonical"</code> link<a class="headerlink" href="#utilize-relcanonical-link" title="Permanent link">¶</a></h3>
<p>This is kind of a must-do for avoiding being marked as duplicate content.
This will also defend you against automatic content-scraping schemes by always having a pointer to your original source. </p>
<p>To utilize it on Pelican, make sure that the <code><head></code> section of your base template, <code>base.html</code>, contains a <code><link></code> element with the attribute <code>rel="canonical"</code>; add one if it’s missing.</p>
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos="1 "></span>{% if article %}
<span class="linenos" data-linenos="2 "></span><link rel="canonical" href="{{ SITEURL }}/{{ article.url }}"/>
<span class="linenos" data-linenos="3 "></span>{% endif%}
</code></pre></div>
<p>For example, if you view this page’s source (by pressing <span class="caps">CTRL</span>+U), you should see something like</p>
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos="1 "></span><link rel="canonical" href="https://blog.kmonsoor.com/pelican-how-to-make-seo-friendly/"/>
</code></pre></div>
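<p>To verify that the generated pages really carry the tag, a small script can parse the output. This is only a sketch using Python’s standard library; <code>CanonicalFinder</code> and <code>canonical_urls</code> are hypothetical helper names, not something Pelican provides.</p>

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels:
            self.canonicals.append(attrs.get("href"))


def canonical_urls(html_text):
    """Return all canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals
```

<p>An article page should yield exactly one URL; an empty list means the template did not emit the tag, and more than one means it was emitted twice.</p>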
<h3 id="proper-title-of-each-page">Proper <strong>< title ></strong> of each page<a class="headerlink" href="#proper-title-of-each-page" title="Permanent link">¶</a></h3>
<p>Every page on your site should have a proper title.
For search engines, it represents the page. It should concisely reflect the page’s content.
But try to keep it under 60 characters, or search engines may choose to truncate it. Use each character wisely.</p>
<p>It may look like this in your <code>base.html</code>.</p>
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos="1 "></span>{% if article %}
<span class="linenos" data-linenos="2 "></span><title>{{ article.title }} -- {{ TAGLINE }}</title>
<span class="linenos" data-linenos="3 "></span>{% else %}
<span class="linenos" data-linenos="4 "></span><title>{{ TAGLINE }}</title>
<span class="linenos" data-linenos="5 "></span>{% endif%}
</code></pre></div>
<p>The <strong>else</strong> clause here ensures that non-article pages also get a title, even if it’s just your <strong>tagline</strong> defined in <code>pelicanconf.py</code>.</p>
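<p>The 60-character budget can be checked before publishing. The sketch below is an illustration only: <code>build_title</code> and <code>is_too_long</code> are made-up helpers, and dropping the tagline when the combined title gets too long is my assumption about a sensible fallback, not something the template above does.</p>

```python
MAX_TITLE_LENGTH = 60  # rule of thumb from the text, not a hard limit


def build_title(article_title, tagline, separator=" -- "):
    """Append the tagline only when the result still fits the budget."""
    full = f"{article_title}{separator}{tagline}"
    if len(full) <= MAX_TITLE_LENGTH:
        return full
    return article_title


def is_too_long(title):
    """True when a search engine is likely to truncate the title."""
    return len(title) > MAX_TITLE_LENGTH
```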
<h3 id="meta-descriptions">Meta descriptions<a class="headerlink" href="#meta-descriptions" title="Permanent link">¶</a></h3>
<p>Include a meta description on each page of your site.
Though it may not directly affect <span class="caps">SEO</span> ranking, it appears as a snippet on the search results page.
So, the user should get a proper glimpse of what your page is going to talk about.
Make sure your theme uses Pelican’s <code>summary</code>-tagged text for this purpose. Otherwise, ensure it yourself by editing <code>base.html</code>.</p>
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos="1 "></span>{% if article and article.summary %}
<span class="linenos" data-linenos="2 "></span><meta name="description" content="{{ article.summary|striptags }}"/>
<span class="linenos" data-linenos="3 "></span>{% else %}
<span class="linenos" data-linenos="4 "></span><meta name="description" content="{{ SITE_SUMMARY }}"/>
<span class="linenos" data-linenos="5 "></span>{% endif%}
</code></pre></div>
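<p>A summary often contains markup, entities, or quotes, which do not belong inside a <code>content="..."</code> attribute. As a rough sketch of the clean-up involved (the function name <code>meta_description</code> and the 160-character default are my assumptions, not a Pelican or Google rule):</p>

```python
import html
import re


def meta_description(summary_html, max_length=160):
    """Turn an HTML summary into text that is safe inside content="..."."""
    text = re.sub(r"<[^>]+>", "", summary_html)  # drop tags
    text = html.unescape(text)                   # decode entities
    text = " ".join(text.split())                # collapse whitespace
    if len(text) > max_length:
        # Cut at the last word boundary that fits, then mark the cut.
        text = text[:max_length].rsplit(" ", 1)[0] + "..."
    # Escape quotes and angle brackets so the attribute stays well-formed.
    return html.escape(text, quote=True)
```

<p>In a template, a comparable effect may be had by chaining Jinja2’s <code>striptags</code> and <code>escape</code> filters.</p>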
<h3 id="use-search-console-extensively">Use <code>search-console</code> extensively<a class="headerlink" href="#use-search-console-extensively" title="Permanent link">¶</a></h3>
<p>Google’s <a href="https://www.google.com/webmasters/tools/home">Search-console, previously known as webmaster-tools</a> is your friend. Utilize it as far as you can go.</p>
<h4 id="extensively-use-pagespeed-insights">Extensively use <a href="https://developers.google.com/speed/pagespeed/insights/">PageSpeed Insights</a><a class="headerlink" href="#extensively-use-pagespeed-insights" title="Permanent link">¶</a></h4>
<p>This tool gives quite a lot of insight into where the current bottlenecks of your site are. Address those one by one.</p>
<h4 id="set-preferred-version-of-your-site">Set preferred version of your site<a class="headerlink" href="#set-preferred-version-of-your-site" title="Permanent link">¶</a></h4>
<p>If you have <code>www</code>, <code>http</code> and <code>https</code> versions of your site, tell Google here which one is preferred. It’s only applicable to your domain root. Once it is applied and Google has re-indexed your site, all the search results from your site will show that preferred version.</p>
<p><img alt="setting preference for www or non-www version" src="http://i.imgur.com/51JY1oel.png"></p>
<p>If you have both the <code>http</code> and <code>https</code> versions of your site, you have to add each of them and set the same preference for both.</p>
<p>If you have both, you can <strong>either</strong> use a JavaScript snippet in the <code><head></code> of <code>base.html</code> to redirect any <code>http</code> page to its <code>https</code> counterpart,</p>
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos="1 "></span><script type="text/javascript">
<span class="linenos" data-linenos="2 "></span> var host = "your-site.com";
<span class="linenos" data-linenos="3 "></span> if ((host == window.location.host) && (window.location.protocol != "https:"))
<span class="linenos" data-linenos="4 "></span> window.location.protocol = "https";
<span class="linenos" data-linenos="5 "></span></script>
</code></pre></div>
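<p><strong>or</strong>, if your site is served through Apache, you can do it through the site’s <code>.htaccess</code> file by adding the following. (<span class="caps">NGINX</span> does not read <code>.htaccess</code> files; there, the redirect has to be set up in the server configuration instead.)</p>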
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos="1 "></span>RewriteEngine On
<span class="linenos" data-linenos="2 "></span>RewriteCond %{HTTPS} !on
<span class="linenos" data-linenos="3 "></span>RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
</code></pre></div>
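<p>Whichever variant is used, the rule is the same: keep host, path, query and fragment, and change only the scheme. A tiny sketch of that mapping, handy for writing a check script (<code>to_https</code> is a hypothetical helper name):</p>

```python
from urllib.parse import urlsplit


def to_https(url):
    """Return the https:// counterpart of an http:// URL, else the URL unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    # Only the scheme changes; host, path, query and fragment are kept.
    return parts._replace(scheme="https").geturl()
```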
<p><strong>Or</strong>, if you are using CloudFlare <span class="caps">CDN</span>, you can create a page-rule for that as I have shown below.</p>
<p><img alt="always-https by CloudFlare page-rules" src="http://i.imgur.com/9ISFbtvm.png"></p>
<h4 id="check-index-status">Check index-status<a class="headerlink" href="#check-index-status" title="Permanent link">¶</a></h4>
<p>Once in a while, check Google’s index status of your site on the <code>search-console</code>. Look for error messages or suggestions.</p>
<p>After every major change in your site’s structure, make sure Google bots “know” about it. You can somewhat expedite the process by <a href="https://www.google.com/webmasters/tools/submit-url">manually submitting</a> your site.</p>
<h3 id="include-opengraph-data">Include <code>OpenGraph</code> data<a class="headerlink" href="#include-opengraph-data" title="Permanent link">¶</a></h3>
<p>Make sure each of your pages includes proper <a href="http://ogp.me/">OpenGraph</a> tags, e.g. <code>og:title</code>, <code>og:description</code> etc., in your template.</p>
<p>Though OpenGraph originated at Facebook, these tags are now widely used by other social platforms, even by Google+. In the absence of Twitter-specific tags, Twitter also falls back to these <code>og</code> tags. Try to include both <code>og:</code> and <code>twitter:</code> tags. Proper data in these tags makes your article cleanly shareable on these social sites.</p>
<p>The snippet below, which <a href="https://github.com/kmonsoor/blog.kmonsoor.com/blob/pelican-how-to-make-seo-friendly/plumage/templates/base.html">I use myself</a>, can serve as a starting point.</p>
<div class="highlight"><pre><span></span><code><span class="linenos" data-linenos=" 1 "></span><!-- OpenGraph protocol tags: http://ogp.me/ -->
<span class="linenos" data-linenos=" 2 "></span><!-- originally adopted to be used for: https://blog.kmonsoor.com -->
<span class="linenos" data-linenos=" 3 "></span><meta property="og:site_name" content="{{ SITENAME }}" />
<span class="linenos" data-linenos=" 4 "></span><meta property="og:type" content="article" />
<span class="linenos" data-linenos=" 5 "></span>{% if article and article.title %}
<span class="linenos" data-linenos=" 6 "></span><meta property="og:title" content="{{ article.title }} -- {{ TAGLINE }}" />
<span class="linenos" data-linenos=" 7 "></span><meta property="og:url" content="{{ SITEURL }}/{{ article.url }}" />
<span class="linenos" data-linenos=" 8 "></span>{% endif%}
<span class="linenos" data-linenos=" 9 "></span>{% if article and article.summary %}
<span class="linenos" data-linenos="10 "></span><meta property="og:description" content="{{ article.summary|striptags }}" />
<span class="linenos" data-linenos="11 "></span>{% else %}
<span class="linenos" data-linenos="12 "></span><meta property="og:description" content="{{ SITE_SUMMARY }}"/>
<span class="linenos" data-linenos="13 "></span>{% endif%}
<span class="linenos" data-linenos="14 "></span>{% if article and article.date %}
<span class="linenos" data-linenos="15 "></span><meta property="article:published_time" content="{{ article.date.isoformat() }}" />
<span class="linenos" data-linenos="16 "></span>{% endif%}
<span class="linenos" data-linenos="17 "></span>{% if article and article.modified %}
<span class="linenos" data-linenos="18 "></span><meta property="article:modified_time" content="{{ article.modified.isoformat() }}" />
<span class="linenos" data-linenos="19 "></span>{% endif%}
<span class="linenos" data-linenos="20 "></span><!-- End of OpenGraph protocol tags -->
<span class="linenos" data-linenos="21 "></span>
<span class="linenos" data-linenos="22 "></span>{% if TWITTER_USERNAME %}
<span class="linenos" data-linenos="23 "></span><meta name="twitter:site" content="@{{ TWITTER_USERNAME }}" />
<span class="linenos" data-linenos="24 "></span><meta name="twitter:creator" content="@{{ TWITTER_USERNAME }}" />
<span class="linenos" data-linenos="25 "></span>{% endif%}
<span class="linenos" data-linenos="26 "></span><meta name="twitter:image" content="INSERT-YOUR-TWITTER-IMAGE-LINK" />
<span class="linenos" data-linenos="27 "></span><meta name="twitter:card" content="summary" />
<span class="linenos" data-linenos="28 "></span>{% if article and article.summary %}
<span class="linenos" data-linenos="29 "></span><meta name="twitter:description" content="{{ article.summary|striptags }}" />
<span class="linenos" data-linenos="30 "></span>{% else %}
<span class="linenos" data-linenos="31 "></span><meta name="twitter:description" content="{{ SITE_SUMMARY }}"/>
<span class="linenos" data-linenos="32 "></span>{% endif%}
</code></pre></div>
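<p>A quick way to confirm that a generated page carries the tags is to parse it and list what is missing. The sketch below is an assumption-laden illustration: <code>MetaCollector</code>, <code>missing_og_tags</code> and the <code>EXPECTED</code> list (taken from the snippet above, not from the OpenGraph specification) are all made up for this example. It looks only at the <code>property</code> attribute, so a tag written with <code>name="og:..."</code> is reported as missing.</p>

```python
from html.parser import HTMLParser

# Tags emitted by the template snippet above; adjust to your own theme.
EXPECTED = ("og:site_name", "og:type", "og:title", "og:url", "og:description")


class MetaCollector(HTMLParser):
    """Collect <meta property="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property")
        if key:
            self.found[key] = attrs.get("content") or ""


def missing_og_tags(html_text, expected=EXPECTED):
    """Return the expected properties that are absent or have empty content."""
    collector = MetaCollector()
    collector.feed(html_text)
    return [name for name in expected if not collector.found.get(name)]
```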
<h4 id="notes">Notes<a class="headerlink" href="#notes" title="Permanent link">¶</a></h4>
<ul>
<li>
<p>To grab your own Twitter avatar link, do the following:</p>
<ol>
<li>Go to your Twitter profile page</li>
<li>Right-click on your profile picture</li>
<li>Select “Copy image address” / “Copy image link”</li>
</ol>
</li>
<li>
<p>For <code>OpenGraph</code> tags, you may also consider using the <a href="https://github.com/whiskyechobravo/pelican-open_graph/tree/master">pelican-opengraph</a> plugin.</p>
</li>
<li>
<p>For all these to work properly, make sure <code>SITEURL</code>, <code>TAGLINE</code>, <code>SITE_SUMMARY</code> and <code>TWITTER_USERNAME</code> are properly defined in your <code>pelicanconf.py</code> as well as in your <code>publishconf.py</code>. Please remember that the definitions in <code>publishconf.py</code> only apply when you use the <code>make publish</code> command.</p>
</li>
</ul>
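<p>As a sketch of what those definitions might look like in <code>pelicanconf.py</code>: every value below is a placeholder of mine, and <code>TAGLINE</code>, <code>SITE_SUMMARY</code> and <code>TWITTER_USERNAME</code> are custom settings that only matter because the templates above read them.</p>

```python
# pelicanconf.py (excerpt): every value below is a placeholder
SITENAME = "Example Blog"
SITEURL = ""  # commonly left empty for local development

# Custom settings read by the template snippets in this post.
TAGLINE = "Notes on software and other things"
SITE_SUMMARY = "A personal blog about software development."
TWITTER_USERNAME = "example_user"  # no leading "@"; the template adds it

# In publishconf.py, SITEURL would typically be overridden with the
# real address of the site, e.g. SITEURL = "https://blog.example.com"
```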
<h3 id="loading-performance">Loading Performance<a class="headerlink" href="#loading-performance" title="Permanent link">¶</a></h3>
<h4 id="compress-everything">Compress everything<a class="headerlink" href="#compress-everything" title="Permanent link">¶</a></h4>
<ul>
<li>Page speed impacts <span class="caps">SEO</span> directly. Google penalizes slow sites, especially when the search is made on a mobile device. Mobile-optimized sites tend to rank higher in searches from mobile devices.</li>
</ul>
<p>So, make sure all static files are compressed. If not, minify your theme’s <span class="caps">JS</span> and <span class="caps">CSS</span> files yourself to a <em>.min.</em> version and then reference those in the template files of the theme.</p>
<p>Or, better, use the <a href="https://github.com/getpelican/pelican-plugins/tree/master/gzip_cache">gzip_cache</a> plugin to gzip all the <span class="caps">HTML</span> files statically, and the <a href="https://github.com/getpelican/pelican-plugins/tree/master/yuicompressor">yuicompressor</a> plugin to compress <span class="caps">JS</span> <span class="amp">&</span> <span class="caps">CSS</span> files for Pelican. These make sure that, upon build, everything is compressed.</p>
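<p>To make the idea of static gzipping concrete, here is a minimal standard-library sketch. It is not the plugin’s actual implementation; <code>precompress</code> and the <code>COMPRESSIBLE</code> set are names and choices I made up for illustration. It writes a <code>.gz</code> file next to each text-like file in the output folder.</p>

```python
import gzip
import shutil
from pathlib import Path

# File types worth compressing; images are usually compressed already.
COMPRESSIBLE = {".html", ".css", ".js", ".xml", ".txt"}


def precompress(output_dir):
    """Write a .gz sibling next to every compressible file under output_dir."""
    written = []
    # sorted() materializes the listing first, so files created during
    # the loop are not picked up again.
    for path in sorted(Path(output_dir).rglob("*")):
        if path.is_file() and path.suffix in COMPRESSIBLE:
            gz_path = path.with_name(path.name + ".gz")
            with path.open("rb") as src, gzip.open(gz_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(gz_path)
    return written
```

<p>The web server still has to be configured to serve the pre-compressed files, so this only helps on hosts that support it.</p>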
<h4 id="utilize-cdn-if-you-can">Utilize <span class="caps">CDN</span> if you can<a class="headerlink" href="#utilize-cdn-if-you-can" title="Permanent link">¶</a></h4>
<ul>
<li>
<p>Use <span class="caps">CDN</span>-hosted versions of common libraries (e.g. jQuery, Bootstrap etc.) rather than hosting your own copy, unless your theme has modified them. Look them up on <a href="https://cdnjs.cloudflare.com">CloudFlare cdnjs</a>, <a href="https://cdnjs.com">cdnjs</a>, or <a href="https://www.jsdelivr.com">jsdelivr</a> and use those links.</p>
</li>
<li>
<p>Try to use a <span class="caps">CDN</span> for edge distribution of your site. CloudFlare is the only one I know of that provides this service for free for a single site. There might be others. <span class="caps">CF</span> also makes managing <span class="caps">DNS</span> configuration a breeze.</p>
</li>
</ul>
<h3 id="engage-commenting">Engage commenting<a class="headerlink" href="#engage-commenting" title="Permanent link">¶</a></h3>
<p>For a static site, integrating a commenting system may look a little far-fetched. However, a blog without a proper commenting system feels kinda lame sometimes. Of course, <span class="caps">YMMV</span>.</p>
<p>But it’s not difficult; it can easily be done with systems like <a href="https://disqus.com/">Disqus</a>. I’m not affiliated with them, by the way.</p>
<h3 id="host-images-separately">Host images separately<a class="headerlink" href="#host-images-separately" title="Permanent link">¶</a></h3>
<p>Host all the images that you’ve used in your articles separately. Use image-specific hosting, e.g. imgur.com, imgpile.com, UltraIMG.com, postimage.org etc.</p>
<p>But why? Because these services provide a couple of benefits besides being free.</p>
<ul>
<li>Firstly, while loading the page, the browser can load images from these hosts in parallel, rather than everything from your blog’s own host.</li>
<li>More often than not, these services use their own <span class="caps">CDN</span>.</li>
<li>Often, these services automatically resize your uploaded images for use in different contexts, which lets you choose the best-fit size on the fly without resizing by hand.</li>
</ul>
<h2 id="other-tips">Other tips<a class="headerlink" href="#other-tips" title="Permanent link">¶</a></h2>
<h3 id="name-your-images-properly">Name your images properly<a class="headerlink" href="#name-your-images-properly" title="Permanent link">¶</a></h3>
<p>Search engines index images too. With proper names, images become relevant to the topic, and hence have the potential to draw traffic.</p>
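<p>One simple convention is to derive the file name from a short description of the image. The helper below is just a sketch of that idea; <code>seo_filename</code> is a made-up name, and lower-case words joined by hyphens is my assumption of a reasonable format.</p>

```python
import re
import unicodedata


def seo_filename(description, extension):
    """Build a lower-case, hyphen-separated file name from a description."""
    # Strip accents, then drop anything that is not plain ASCII.
    text = unicodedata.normalize("NFKD", description)
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    # Every run of other characters becomes a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    return f"{slug}.{extension.lstrip('.').lower()}"
```

<p>For example, an image described as “Change repo settings on GitHub” would be saved as <code>change-repo-settings-on-github.png</code> instead of something like <code>IMG_0042.png</code>.</p>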
<h3 id="use-google-keyword-planner">Use Google Keyword planner<a class="headerlink" href="#use-google-keyword-planner" title="Permanent link">¶</a></h3>
<p>Even if you are not willing to blow money on ads, it will help you immensely in finding out which keywords are searched for the most.</p>
<h3 id="page-headers-controversial">Page-headers (controversial)<a class="headerlink" href="#page-headers-controversial" title="Permanent link">¶</a></h3>
<ul>
<li>Don’t use multiple 1<sup>st</sup>-level (<code>H1</code>) headers on a page. The main title of the page has probably already used it once; check your theme.
So, avoid using it again, meaning avoid underlined-style (<code>=======</code>) or hash-style (single ‘#’) headers in your markdown files.</li>
</ul>
<p>However, 2<sup>nd</sup>-level <code>H2</code> tags can be (read ‘should be’) used multiple times. In markdown files, that’s a <code>---------</code> underline, or a line starting with a double hash (‘##’).</p>
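<p>To find articles that break this rule, the markdown sources can be scanned for 1<sup>st</sup>-level headers. This is a rough sketch, not a full markdown parser: <code>count_h1</code> is a hypothetical helper, and it only knows about fenced code blocks, so indented code containing ‘#’ lines would be miscounted.</p>

```python
import re

FENCE = "`" * 3  # marker that opens and closes a fenced code block


def count_h1(markdown_text):
    """Count level-1 headings: '# Title' lines and '=====' underlines."""
    lines = markdown_text.splitlines()
    count = 0
    in_fence = False
    for i, line in enumerate(lines):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # '#' inside code is a comment, not a heading
        if re.match(r"#(?!#)\s*\S", line):
            count += 1  # hash-style: exactly one leading '#'
        elif re.fullmatch(r"=+\s*", line) and i > 0 and lines[i - 1].strip():
            count += 1  # underlined-style: text line followed by '====='
    return count
```

<p>Any article for which this returns more than zero probably ends up with a second <code>H1</code> on the rendered page, assuming the theme already uses one for the title.</p>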
<h3 id="and-theres-lot-more">And, there’s a lot more … ;)<a class="headerlink" href="#and-theres-lot-more" title="Permanent link">¶</a></h3>
<p>As I gain more insights, I hope to grow this post. For now, this is a work in progress.<br>
Thanks for reading this far. Adios!</p>