<h1>Pelican Static sites - SEO Optimization</h1>
<p>Khaled Monsoor, 2017-01-07</p>
<p>Writing is a hard job, especially when it is your hobby alongside your day job. But what&rsquo;s the benefit if nobody reads it just because they couldn&rsquo;t find&nbsp;it?</p>
<p>Usually the <code>Pelican</code> <a href="https://github.com/getpelican/pelican">static-site generator</a> is not very concerned about the <span class="caps">SEO</span> of the generated site, mainly because it&rsquo;s not that focused on commercial usage. That&rsquo;s what I felt. Related themes and their templates also don&rsquo;t take it very seriously. 
But you shouldn&rsquo;t lose <span class="caps">SEO</span> just because you migrated from Wordpress or whatever you were using previously,&nbsp;right?</p>
<p>Oftentimes theme authors focus more on the look-and-feel of the theme, and not so much on the <span class="caps">SEO</span>&nbsp;concerns.</p>
<p>That&rsquo;s what this post is about; let&rsquo;s fix&nbsp;that.</p>
<h4 id="note-1-about-other-static-site-generators">Note 1: About other static-site generators<a class="headerlink" href="#note-1-about-other-static-site-generators" title="Permanent link">&para;</a></h4>
<p>Although the following discussion <span class="amp">&amp;</span> code are mostly specific to <a href="https://github.com/getpelican/pelican">Pelican</a> templates, which use the <a href="http://jinja.pocoo.org/">jinja2</a> templating language, the concepts and concerns here are applicable to most static-site generators and their&nbsp;themes.</p>
<h4 id="note-2-im-in-no-way-an-seo-expert">Note 2: I&rsquo;m, in no way, an <span class="caps">SEO</span> expert<a class="headerlink" href="#note-2-im-in-no-way-an-seo-expert" title="Permanent link">&para;</a></h4>
<p>This writeup is just a collection of my findings while correcting my blog&rsquo;s <span class="caps">SEO</span> course; fixing the stupid mistakes. Also, this isn&rsquo;t a commercial site. There are many more, and more in-depth, aspects of <span class="caps">SEO</span> than the following that can be very important for commercial&nbsp;projects.</p>
<h2 id="getting-started">Getting started<a class="headerlink" href="#getting-started" title="Permanent link">&para;</a></h2>
<h3 id="lookup-for-missing-pieces">Look for missing pieces<a class="headerlink" href="#lookup-for-missing-pieces" title="Permanent link">&para;</a></h3>
<p>Make sure all the linked resources (links, images, <span class="caps">CSS</span> <span class="amp">&amp;</span> <span class="caps">JS</span> files) that you&rsquo;ve used in your pages are valid. 
To do that, check the browser&rsquo;s <code>console</code> in the <code>Developer tools</code> for any errors, e.g. unavailable URLs, faulty HTML/CSS&nbsp;etc.</p>
<h3 id="know-the-critical-spots">Know the critical spots<a class="headerlink" href="#know-the-critical-spots" title="Permanent link">&para;</a></h3>
<ul> <li><strong>pelicanconf.py</strong> - It&rsquo;s usually in the root folder of your&nbsp;site.</li> <li><strong>base.html</strong> - It&rsquo;s in the <code>templates</code> folder of the theme that you are using. Changes here will impact all <span class="caps">HTML</span> pages generated by&nbsp;Pelican.</li> <li><strong>article.html</strong> - In the same folder as <code>base.html</code>. Changes here will impact only the article&nbsp;pages.</li> </ul>
<h2 id="avoiding-duplication">Avoiding Duplication<a class="headerlink" href="#avoiding-duplication" title="Permanent link">&para;</a></h2>
<h3 id="avoid-your-source-getting-indexed-by-google">Avoid your source getting indexed by Google<a class="headerlink" href="#avoid-your-source-getting-indexed-by-google" title="Permanent link">&para;</a></h3>
<p>If the source of your blog is not open-sourced, meaning the content is not in a public repo, this isn&rsquo;t your concern. But if it is, it should be a&nbsp;concern.</p>
<p>The reason is that <code>Github.com</code> (or wherever your repo is hosted) is a stronger domain than yours, and Google &ldquo;sees&rdquo; that your site&rsquo;s content (even though it&rsquo;s yours) is also on Github. So, there&rsquo;s a high chance that it&rsquo;ll mark your content as &ldquo;duplicate&rdquo;. And duplicate content gets hammered heavily from Google&rsquo;s <span class="caps">SEO</span> point of view. When someone searches for a writeup of yours, even if it&rsquo;s unique on the Internet, Google search will show both your site and the repo, and the links to your site will possibly rank&nbsp;lower.</p>
<p>That raises the question of how to avoid it. 
It&rsquo;s not&nbsp;complicated.</p>
<p>For example, at the time of writing, Github.com&rsquo;s <a href="https://github.com/robots.txt">robots.txt</a> allows Google (or any search engine for that matter) to index only the <code>master</code> branch of any open-source repo. So, if your repo doesn&rsquo;t have any branch named <code>master</code>, it won&rsquo;t be indexed. That&rsquo;s it. Rename your &ldquo;master&rdquo; to &ldquo;live&rdquo;, &ldquo;main&rdquo;, &ldquo;production&rdquo; or whatever you feel&nbsp;like.</p>
<ol> <li>Create a new&nbsp;&ldquo;master&rdquo;</li> </ol>
<div class="highlight"><pre><span></span><code>$ git checkout -b new-master
$ git push -u origin new-master
</code></pre></div>
<ol start="2"> <li>Tell Github about the move by making the new branch the default one in the repo&nbsp;settings</li> </ol>
<p><img alt="change repo-settings on Github" src="http://i.imgur.com/wjf6zwul.png"></p>
<ol start="3"> <li>Now, delete the <code>master</code> branch from your&nbsp;repo.</li> </ol>
<div class="highlight"><pre><span></span><code>$ git branch -d master
$ git push origin :master
</code></pre></div>
<h3 id="utilize-relcanonical-link">Utilize <code>rel="canonical"</code> link<a class="headerlink" href="#utilize-relcanonical-link" title="Permanent link">&para;</a></h3>
<p>This is kind of a must-do for avoiding being marked as duplicate content. This will also defend you against automatic content-scraping schemes by always having a pointer to your original&nbsp;source. 
</p>
<p>To utilize it on Pelican, make sure there is a <code>&lt;link&gt;</code> element with the attribute <code>rel="canonical"</code> in the <code>&lt;head&gt;</code> section of your base template, <code>base.html</code>; add it if it&rsquo;s&nbsp;missing.</p>
<div class="highlight"><pre><span></span><code>{% if article %}
&lt;link rel=&quot;canonical&quot; href=&quot;{{ SITEURL }}/{{ article.url }}&quot;/&gt;
{% endif %}
</code></pre></div>
<p>For example, if you view this page&rsquo;s source (by pressing <span class="caps">CTRL</span>+U), you should see something&nbsp;like</p>
<div class="highlight"><pre><span></span><code>&lt;link rel=&quot;canonical&quot; href=&quot;https://blog.kmonsoor.com/pelican-how-to-make-seo-friendly/&quot;/&gt;
</code></pre></div>
<h3 id="proper-title-of-each-page">Proper <strong>&lt; title &gt;</strong> of each page<a class="headerlink" href="#proper-title-of-each-page" title="Permanent link">&para;</a></h3>
<p>Every page on your site should have a proper title. For search engines, it represents the page. It should concisely reflect the page&rsquo;s content. But try to keep it under 60 characters, or search engines may choose to truncate it. 
Use each character&nbsp;wisely.</p>
<p>It may look like this in your <code>base.html</code>.</p>
<div class="highlight"><pre><span></span><code>{% if article %}
&lt;title&gt;{{ article.title }} -- {{ TAGLINE }}&lt;/title&gt;
{% else %}
&lt;title&gt;{{ TAGLINE }}&lt;/title&gt;
{% endif %}
</code></pre></div>
<p>The <strong>else</strong> clause here is to ensure that non-article pages also get a title, even if it&rsquo;s just your <strong>tagline</strong> defined in <code>pelicanconf.py</code>.</p>
<h3 id="meta-descriptions">Meta descriptions<a class="headerlink" href="#meta-descriptions" title="Permanent link">&para;</a></h3>
<p>Include a meta description on each page of your site. Though it may not directly affect <span class="caps">SEO</span> ranking, it appears as a snippet on the search results page. So, users should get a proper glimpse of what your page is going to talk about. Make sure your theme uses Pelican&rsquo;s <code>summary</code>-tagged text for this purpose. 
Otherwise, ensure it yourself by editing <code>base.html</code>.</p>
<div class="highlight"><pre><span></span><code>{% if article and article.summary %}
&lt;meta name=&quot;description&quot; content=&quot;{{ article.summary|striptags }}&quot;/&gt;
{% else %}
&lt;meta name=&quot;description&quot; content=&quot;{{ SITE_SUMMARY }}&quot;/&gt;
{% endif %}
</code></pre></div>
<h3 id="use-search-console-extensively">Use <code>search-console</code> extensively<a class="headerlink" href="#use-search-console-extensively" title="Permanent link">&para;</a></h3>
<p>Google&rsquo;s <a href="https://www.google.com/webmasters/tools/home">Search-console, previously known as webmaster-tools</a>, is your friend. Utilize it as far as you can&nbsp;go.</p>
<h4 id="extensively-use-pagespeed-insights">Extensively use <a href="https://developers.google.com/speed/pagespeed/insights/">PageSpeed Insights</a><a class="headerlink" href="#extensively-use-pagespeed-insights" title="Permanent link">&para;</a></h4>
<p>This tool gives quite a lot of insight into where the current bottlenecks of your site are. Address those&nbsp;one-by-one.</p>
<h4 id="set-preferred-version-of-your-site">Set preferred version of your site<a class="headerlink" href="#set-preferred-version-of-your-site" title="Permanent link">&para;</a></h4>
<p>If you have <code>www</code>, <code>http</code> and <code>https</code> versions of your site, tell Google here which one is preferred. It&rsquo;s only applicable to your domain-root. 
Once it is applied and Google has re-indexed your site, all the search results from your site will show that preferred version of your&nbsp;site.</p>
<p><img alt="setting preference for www or non-www version" src="http://i.imgur.com/51JY1oel.png"></p>
<p>You have to add your site and do the same for both the <code>http</code> and <code>https</code> versions of your site, if you have&nbsp;both.</p>
<p>If you have both, <strong>either</strong> you can use a JavaScript snippet in the <code>&lt;head&gt;</code> of <code>base.html</code> to redirect any <code>http</code> page to its <code>https</code> counterpart.</p>
<div class="highlight"><pre><span></span><code>&lt;script type=&quot;text/javascript&quot;&gt;
  // redirect only when the page is served from the production host
  var host = &quot;your-site.com&quot;;
  if ((host == window.location.host) &amp;&amp; (window.location.protocol != &quot;https:&quot;)) {
    window.location.protocol = &quot;https&quot;;
  }
&lt;/script&gt;
</code></pre></div>
<p><strong>or</strong>, if your site is served through Apache, you can do it through the site&rsquo;s <code>.htaccess</code> file by adding the following (<span class="caps">NGINX</span> does not read <code>.htaccess</code> files, so there the redirect has to go into the server configuration&nbsp;instead).</p>
<div class="highlight"><pre><span></span><code>RewriteEngine On
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
</code></pre></div>
<p><strong>Or</strong>, if you are using CloudFlare <span class="caps">CDN</span>, you can create a page-rule for that as I have shown&nbsp;below.</p>
<p><img alt="always-https by CloudFlare page-rules" src="http://i.imgur.com/9ISFbtvm.png"></p>
<h4 id="check-index-status">Check index-status<a class="headerlink" href="#check-index-status" 
title="Permanent link">&para;</a></h4>
<p>Once in a while, check Google&rsquo;s index status of your site on the <code>search-console</code>. Look for error messages or&nbsp;suggestions.</p>
<p>After every major change in your site&rsquo;s structure, make sure Google bots &ldquo;know&rdquo; about it. You can somewhat expedite the process by <a href="https://www.google.com/webmasters/tools/submit-url">manually submitting</a> your&nbsp;site.</p>
<h3 id="include-opengraph-data">Include <code>OpenGraph</code> data<a class="headerlink" href="#include-opengraph-data" title="Permanent link">&para;</a></h3>
<p>Make sure each of your pages includes proper <a href="http://ogp.me/">OpenGraph</a> tags, e.g. <code>og:title</code>, <code>og:description</code> etc., in your&nbsp;template.</p>
<p>Though OpenGraph originated from Facebook Inc., these tags are now widely used by other social engines, even by Google+. In the absence of Twitter tags, Twitter also uses these <code>og</code> tags. Try to include both <code>og:</code> and <code>twitter:</code> tags. 
Proper data in these tags makes your article cleanly shareable on these social&nbsp;sites.</p>
<p>The snippet below, based on what <a href="https://github.com/kmonsoor/blog.kmonsoor.com/blob/pelican-how-to-make-seo-friendly/plumage/templates/base.html">I use myself</a>, can serve as a starting&nbsp;point.</p>
<div class="highlight"><pre><span></span><code>&lt;!-- OpenGraph protocol tags: http://ogp.me/ --&gt;
&lt;!-- originally adopted to be used for: https://blog.kmonsoor.com --&gt;
&lt;meta property=&quot;og:site_name&quot; content=&quot;{{ SITENAME }}&quot; /&gt;
&lt;meta property=&quot;og:type&quot; content=&quot;article&quot; /&gt;
{% if article and article.title %}
&lt;meta property=&quot;og:title&quot; content=&quot;{{ article.title }} -- {{ TAGLINE }}&quot; /&gt;
&lt;meta property=&quot;og:url&quot; content=&quot;{{ SITEURL }}/{{ article.url }}&quot; /&gt;
{% endif %}
{% if article and article.summary %}
&lt;meta property=&quot;og:description&quot; content=&quot;{{ article.summary|striptags }}&quot; /&gt;
{% else %}
&lt;meta property=&quot;og:description&quot; content=&quot;{{ SITE_SUMMARY }}&quot;/&gt;
{% endif %}
{% if article and article.date %}
&lt;meta property=&quot;article:published_time&quot; content=&quot;{{ article.date.isoformat() }}&quot; /&gt;
{% endif %}
{% if article and article.modified %}
&lt;meta property=&quot;article:modified_time&quot; content=&quot;{{ article.modified.isoformat() }}&quot; /&gt;
{% endif %}
&lt;!-- End of OpenGraph protocol tags --&gt;

{% if TWITTER_USERNAME %}
&lt;meta name=&quot;twitter:site&quot; content=&quot;@{{ TWITTER_USERNAME }}&quot; /&gt;
&lt;meta name=&quot;twitter:creator&quot; content=&quot;@{{ TWITTER_USERNAME }}&quot; /&gt;
{% endif %}
&lt;!-- twitter:card takes a card type; the text goes into twitter:description --&gt;
&lt;meta name=&quot;twitter:card&quot; content=&quot;summary&quot; /&gt;
&lt;meta name=&quot;twitter:image&quot; content=&quot;INSERT-YOUR-TWITTER-IMAGE-LINK&quot; /&gt;
{% if article and article.summary %}
&lt;meta name=&quot;twitter:description&quot; content=&quot;{{ article.summary|striptags }}&quot; /&gt;
{% else %}
&lt;meta name=&quot;twitter:description&quot; content=&quot;{{ SITE_SUMMARY }}&quot;/&gt;
{% endif %}
</code></pre></div>
<h4 id="notes">Notes<a class="headerlink" href="#notes" title="Permanent link">&para;</a></h4>
<ul>
<li> <p>To grab your own Twitter avatar link, do the&nbsp;following:</p> <ol> <li>Go to your Twitter profile&nbsp;page</li> <li>Right-click on your profile&nbsp;picture</li> <li>Select &ldquo;Copy image address&rdquo; / &ldquo;Copy image&nbsp;link&rdquo;</li> </ol> </li>
<li> <p>For <code>OpenGraph</code> tags you may also consider using the <a href="https://github.com/whiskyechobravo/pelican-open_graph/tree/master">pelican-opengraph</a>&nbsp;plugin.</p> </li>
<li> <p>For all these to work properly, make sure <code>SITEURL</code>, <code>TAGLINE</code>, <code>SITE_SUMMARY</code> and <code>TWITTER_USERNAME</code> are properly defined in your <code>pelicanconf.py</code> as well as in your <code>publishconf.py</code> file. Please remember that definitions in <code>publishconf.py</code> only apply when you are using the <code>make publish</code> command.</p> </li>
</ul>
<h3 id="loading-performance">Loading Performance<a class="headerlink" href="#loading-performance" title="Permanent link">&para;</a></h3>
<h4 id="compress-everything">Compress everything<a class="headerlink" href="#compress-everything" title="Permanent link">&para;</a></h4>
<ul> <li>PageSpeed impacts <span class="caps">SEO</span> directly. Google punishes slow sites, especially when the search is made on a mobile device. Mobile-optimized sites tend to rank higher on searches from&nbsp;mobile-devices.</li> </ul>
<p>So, make sure all static files are compressed. If not, minify your theme&rsquo;s <span class="caps">JS</span> and <span class="caps">CSS</span> files yourself to a <em>.min.</em> version and then replace those in the template files of the&nbsp;theme.</p>
<p>Or, better, use the <a href="https://github.com/getpelican/pelican-plugins/tree/master/gzip_cache">gzip_cache</a> plugin for gzipping all the <span class="caps">HTML</span> files statically, and the <a href="https://github.com/getpelican/pelican-plugins/tree/master/yuicompressor">yuicompressor</a> plugin for compressing <span class="caps">JS</span> <span class="amp">&amp;</span> <span class="caps">CSS</span> files for Pelican. 
Those will make sure that, upon build, everything is&nbsp;compressed.</p>
<h4 id="utilize-cdn-if-you-can">Utilize <span class="caps">CDN</span> if you can<a class="headerlink" href="#utilize-cdn-if-you-can" title="Permanent link">&para;</a></h4>
<ul>
<li> <p>Use <span class="caps">CDN</span>-ed versions of common libraries (e.g. jQuery, Bootstrap etc.) rather than hosting your own copy, unless your theme has actively modified them. Look them up on <a href="https://cdnjs.cloudflare.com">CloudFlare cdnjs</a>, <a href="https://cdnjs.com">cdnjs</a>, or <a href="https://www.jsdelivr.com">jsdelivr</a> etc. and use those&nbsp;links.</p> </li>
<li> <p>Try to use a <span class="caps">CDN</span> for edge-distribution of your site. CloudFlare is the only one I know of that provides this service for free for a single site. There might be others. <span class="caps">CF</span> also makes managing <span class="caps">DNS</span> configuration a&nbsp;breeze.</p> </li>
</ul>
<h3 id="engage-commenting">Engage commenting<a class="headerlink" href="#engage-commenting" title="Permanent link">&para;</a></h3>
<p>While serving a static site, integrating a commenting system looks a little far-fetched. However, blogs without a proper commenting system feel kinda lame sometimes. Of course, <span class="caps">YMMV</span>.</p>
<p>But it&rsquo;s not difficult; it can easily be done with systems like <a href="https://disqus.com/">Disqus</a> etc. I&rsquo;m not affiliated with them, by the&nbsp;way.</p>
<h3 id="host-images-separately">Host images separately<a class="headerlink" href="#host-images-separately" title="Permanent link">&para;</a></h3>
<p>Host all the images that you&rsquo;ve used in your articles separately. Use image-specific hosting, e.g. imgur.com, imgpile.com, UltraIMG.com, postimage.org&nbsp;etc.</p>
<p>But, why? 
Because these services provide a couple of benefits besides being&nbsp;free.</p>
<ul> <li>Firstly, while loading the page, the browser can load images from these hosts in parallel with the content from your own blog&nbsp;host.</li> <li>More often than not, these services use their own <span class="caps">CDN</span>.</li> <li>Often these services automatically resize your uploaded images for use in different contexts, which lets you choose the best-fit size on the fly without doing it by&nbsp;hand.</li> </ul>
<h2 id="other-tips">Other tips<a class="headerlink" href="#other-tips" title="Permanent link">&para;</a></h2>
<h3 id="name-your-images-properly">Name your images properly<a class="headerlink" href="#name-your-images-properly" title="Permanent link">&para;</a></h3>
<p>Because search engines index images too. With proper names, images become relevant to the topic, and hence have the potential to draw&nbsp;traffic.</p>
<h3 id="use-google-keyword-planner">Use Google Keyword planner<a class="headerlink" href="#use-google-keyword-planner" title="Permanent link">&para;</a></h3>
<p>Even if you are not willing to blow money on ads, it will help you immensely to find out the more searched-for&nbsp;keywords.</p>
<h3 id="page-headers-controversial">Page-headers (controversial)<a class="headerlink" href="#page-headers-controversial" title="Permanent link">&para;</a></h3>
<ul> <li>Don&rsquo;t use multiple 1<sup>st</sup>-level <code>H1</code> headers. The main title link has probably already used it once. Look it up. So, avoid using it again, meaning avoid underlined-style (<code>=======</code>) or hash-style (single &lsquo;#&rsquo;) headers in your markdown&nbsp;files.</li> </ul>
<p>However, 2<sup>nd</sup>-level <code>H2</code> tags can be (read &lsquo;should be&rsquo;) used multiple times. 
In markdown files, that&rsquo;s a <code>---------</code> underline, or a line starting with a&nbsp;double-hash (&lsquo;##&rsquo;).</p>
<h3 id="and-theres-lot-more">And, there&rsquo;s a lot more &hellip; ;)<a class="headerlink" href="#and-theres-lot-more" title="Permanent link">&para;</a></h3>
<p>As I gain more insights, I hope to grow this post. For now, this is a work-in-progress.<br> Thanks for reading this far. Adios&nbsp;!</p>
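<p>P.S. To double-check the title, meta-description and canonical-link tips from this post on the generated pages, a small script can help. The following is only a minimal sketch in plain Python (standard library only), not something Pelican ships with; the names <code>HeadAudit</code>, <code>audit_page</code> and <code>TITLE_LIMIT</code> are made up for this illustration, and the 60-character limit is just the rough figure mentioned in the title section.</p>

```python
from html.parser import HTMLParser

TITLE_LIMIT = 60  # rough title length suggested in this post (an assumption, tune it)


class HeadAudit(HTMLParser):
    """Collect the <title>, meta description and canonical link of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content")
        elif tag == "link" and attrs.get("rel") == "canonical":
            self.canonical = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that sits between <title> and </title> is kept.
        if self._in_title:
            self.title += data


def audit_page(html_text):
    """Return a list of human-readable problems found in one HTML page."""
    parser = HeadAudit()
    parser.feed(html_text)
    parser.close()

    problems = []
    title = parser.title.strip()
    if not title:
        problems.append("missing <title>")
    elif len(title) > TITLE_LIMIT:
        problems.append("title longer than %d characters" % TITLE_LIMIT)
    if not parser.description:
        problems.append("missing meta description")
    if not parser.canonical:
        problems.append("missing canonical link")
    return problems
```

<p>To use it on a whole site, you could read each <code>.html</code> file from Pelican&rsquo;s output folder (assuming the default <code>output</code> directory), pass its text to <code>audit_page()</code> and print whatever list comes back. An empty list means none of these three checks&nbsp;failed.</p>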