SEO Steps To Take Before Writing a Single Line of Code

You need to bake two things into the heart of your web app: keywords and a clever, scalable content creation strategy.

This is part of my Entreprenerd: Marketing for Programmers book, which is currently available to read for free online.

Research and Choose Keywords

The core of SEO is keyword analysis. Even before choosing a business name or buying a domain, the savvy SEO expert first researches the language and vocabulary used by the biggest proportion of their customers. Do more people search for “radio-controlled airplanes”, “rc airplanes”, or “flying airplane kits”? In order to answer this question, the SEO expert seeks out publicly available data, principally in Google’s Keyword Planner, so as to ground their marketing decisions in quantitative data. By knowing what keywords people type into search platforms, the marketer can build their strategy upon a solid foundation.

The marketer chooses a set of keywords with the intention of ranking highly in Google whenever these keywords are searched for. But because ranking highly is unlikely to happen by accident, the marketer will need to bake these keywords into their website’s information architecture and design, for example by including the keywords in their URLs, page titles, and internal links. In addition to this internal (or on-site) SEO, the marketer will ask external (or off-site) websites to link to them (backlinks) by including the keyword as part of the external link’s anchor text. After all this is done, these keywords will get one more use: They will form the basis for any paid advertising that employs keyword targeting.
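As a rough illustration of baking a keyword into the URL, page title, and link anchor text, here is a minimal Python sketch. The function names (`slugify`, `page_signals`) and the site name are hypothetical, chosen only for this example; a real application would plug the same idea into whatever routing and templating layer it already uses.

```python
import re


def slugify(keyword: str) -> str:
    """Lowercase the keyword and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", keyword.lower()))


def page_signals(keyword: str, site_name: str) -> dict:
    """Derive on-site SEO signals for one page from a single target keyword."""
    return {
        "path": f"/{slugify(keyword)}",         # keyword in the URL
        "title": f"{keyword} | {site_name}",    # keyword in the page title
        "anchor_text": keyword,                 # keyword in internal links
    }


if __name__ == "__main__":
    # "Example Hobby Shop" is a made-up site name.
    print(page_signals("rc airplanes", "Example Hobby Shop"))
```

Keeping the keyword in one place and deriving everything else from it means a later change of keyword touches one value rather than every template.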

Novice internet marketers often believe that the best keywords are the ones with the most monthly searches, as estimated in Google’s Keyword Planner. But this isn’t always true. When someone googles something generic like “financing”, there may be all sorts of different intentions behind their search. Is it a young family looking into financing a home? Is it a car dealership looking to partner with a bank to provide their customers with financing? Or is it a student researching the topic for an economics essay?

As you can see, the keyword “financing” means a lot of things to a lot of people. If a bank that sells mortgages were to attempt to rank for this keyword, then they would not only face heavy competition from similarly minded car financiers, but also receive a lot of irrelevant traffic that bounces instantly, an occurrence which (as I explain later) also hurts SEO efforts.

The marketer should think in terms of the commercial intent behind the search, as relevant to their business. That means that the bank above would only choose keywords specifically relating to home loans, as opposed to broader financing.

Their exact choice of keywords will hinge on how they trade off strategic factors. If they are just starting up their home loan business, there’s no realistic chance that they will even rank on the first page of search results for something as competitive as “home loan financing”, not to mention ranking #1 on that initial page. Instead, they would be better off choosing their battles wisely and investing their efforts into a more specialized keyword for which they have a realistic chance of ranking.

In strategically weighing the pros and cons of each keyword, the marketer will pay attention to the “Suggested Bid” column in Google’s Keyword Planner. This figure represents what Google thinks it would cost to advertise on this keyword through Google AdWords. Assuming that competitors are rational actors, the more expensive keywords are likely to generate more valuable traffic. So why not try to rank organically for these hot leads? (Proviso: High advertising fees mean that the keyword is valuable to someone, but not necessarily to someone selling whatever it is you offer.)

Yet another element of strategically choosing keywords involves sizing up the competition. The marketer will google each of their candidate keywords to see what kinds of pages currently rank highly. Because content is ultimately king in SEO, the marketer needs to be confident that they can produce something that blows the current ranking champions out of the water—not just in terms of quality of writing, but also in comprehensiveness, as well as impressiveness of accompanying illustrations and design.

Those of you with international ambitions, be warned: The optimal keyword might vary by country, meaning that not only must you be prepared to consult the data again when entering a new territory, but you also should have designed your application to support keyword-switching depending on the region. In my business that sells study notes, I erred by hard-coding the keyword “{X} notes” into practically every layer of my application, only to find that once I started operating in the USA, “{X} outlines” performed better. I found myself in the midst of a messy refactor due to that lack of foresight.
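One way to avoid that kind of refactor is to keep the region-specific keyword in a single lookup table rather than in templates, routes, and copy. The sketch below is only an illustration in Python: the table, the region codes, and the function name are assumptions made for this example, not the actual Oxbridge Notes code.

```python
# Hypothetical lookup table: the only place where the regional keyword lives.
KEYWORD_SUFFIX_BY_REGION = {
    "uk": "notes",
    "us": "outlines",
}
DEFAULT_REGION = "uk"  # assumption: fall back to the original market


def product_keyword(subject: str, region: str) -> str:
    """Build the target keyword for a subject, e.g. '{X} notes' or '{X} outlines'."""
    suffix = KEYWORD_SUFFIX_BY_REGION.get(
        region, KEYWORD_SUFFIX_BY_REGION[DEFAULT_REGION]
    )
    return f"{subject} {suffix}"


if __name__ == "__main__":
    print(product_keyword("contract law", "uk"))
    print(product_keyword("contract law", "us"))
```

Page titles, URLs, and headings would then call `product_keyword` instead of embedding the word “notes” directly, so entering a new territory means adding one row to the table.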

So far we’ve focused on direct-route keywords—descriptive keywords that map onto whatever the business sells (e.g., a piano seller targeting “buy pianos” or “piano showrooms”). But sometimes direct-route keywords are not a viable strategy—say if the competition is too steep, or if the search volume is too low. In this scenario, it’s up to the marketer to figure out indirect keywords, which map onto lateral activities and interests of the business’s customers (e.g., now the piano seller targets “piano practice regimes” or “local piano teacher listings”). This approach relies on the idea that people interested in X are also often interested in Y, and the marketer can learn about these correlations by observing what their audience blogs about, reads about in their spare time, shares on Facebook/Twitter, talks about on Reddit, and so on.

Bonus Points: Piggyback off your competitors

I know of a successful “how-to” publishing company that optimised its page for the keyword “{X} NOT For Dummies”, with the intention of siphoning off traffic that originally searched for books by their established competitor, as would be evidenced by searches such as “X For Dummies”. This upstart now ranks #5 on searches for their much larger and better-funded competitor—not bad.

Buy Keyword-Based Domain Names

Keywords contained in the domain name count towards SEO, especially when the domain name is an exact match for the query (i.e., where there are no other words present).^1-2-1 So, if you’re selling German bread, “tom-d-germanbread.co.uk” is better than a totally unrelated domain name like “tom-d-services.co.uk”, and “germanbread.co.uk” is better than “tom-d-germanbread.co.uk”.

There is a downside to keyword-based domain names: They are difficult to brand, and this has negative effects on many aspects of marketing. That said, there may be a way out of this problem: Buy both a branded domain name (i.e., one that is easy to spell, easy to remember, etc.) and lots of keyword-based domains. Next, configure each of these keyword-based domains to run on the same server, albeit with a different “skin”. This lets you present various keyword-specific “mini-sites” to the consumer, each of which will rank well for that one keyword. (It is advisable for these websites not to link with one another though, lest you give Google the impression that you are running a link farm.)

For an example of how this would work, look to Patrick McKenzie, creator of BingoCardCreator.com (a web application that helps teachers print Bingo cards for their students). A few years ago he bought a sizable collection of exact match domains for his highest-performing keywords. For example, to help himself rank for “Halloween Bingo Cards”, he bought “http://www.halloweenbingocards.net/” and created a special HalloweenBingoCards mini-site. For his trouble (and the 10 bucks it cost him to pay for the domain name), he scored himself 27,000 hits that Halloween, triggering 15 sales at $35 a pop. This works out as a total of $525 in recurring sales for the $10 he pays yearly for the domain name. Now imagine doing this en masse with hundreds of domain names in a business that isn’t quite so niche.
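The “same server, different skin” idea can be sketched as a lookup from the incoming Host header to a mini-site configuration. This is a minimal Python sketch under stated assumptions: the `MiniSite` fields, the skin names, and the default site are invented for illustration and are not taken from Bingo Card Creator’s actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniSite:
    keyword: str  # the one keyword this mini-site targets
    skin: str     # hypothetical name of the theme/template set to render


# Hypothetical table: one row per keyword-based domain pointing at the same app.
MINI_SITES = {
    "halloweenbingocards.net": MiniSite("Halloween Bingo Cards", "halloween"),
}
DEFAULT_SITE = MiniSite("Bingo Cards", "default")


def site_for_host(host_header: str) -> MiniSite:
    """Pick the mini-site from the Host header, ignoring case, port, and 'www.'."""
    host = host_header.lower().split(":")[0]
    if host.startswith("www."):
        host = host[len("www."):]
    return MINI_SITES.get(host, DEFAULT_SITE)


if __name__ == "__main__":
    print(site_for_host("www.halloweenbingocards.net"))
    print(site_for_host("bingocardcreator.com"))
```

Adding another exact-match domain then amounts to pointing its DNS at the same server and adding one row to the table.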

Choose Your Top-Level Domain Carefully

Top-level domains are the final part of the domain name (e.g., .com, .org, .uk, .de). For SEO purposes, domain endings are either bound to a particular country (.co.uk, .de) or unattached (.com, .org). (Strictly speaking, .co.uk is a second-level domain under the .uk top-level domain, but it behaves the same way for our purposes.) If your initial domain was country-specific but you later wish to internationalise your business, you’re going to end up with headaches.

Case in point: I started my business with the domain-name “oxbridgenotes.co.uk”. At that time, my business only served British customers and I had limited business ambitions, so I thought that a .co.uk would be A-OK. But with time my ambitions outgrew my UK-bound domain, and, to my dismay, I discovered that any suitably global URL structure that I could create (e.g., subdomains like “fr.oxbridgenotes.co.uk” or “de.oxbridgenotes.co.uk”) would not only look weird to customers in these new regions, but also fail at SEO because search engines predominantly display .co.uk content to UK searchers. This remains true even if contrary information is given in the subdomain or folder information (i.e., “oxbridgenotes.co.uk/fr” still displays in the UK, not France). All in all, the top-level domain dominates the subdomain in terms of SEO signalling to both search engines and humans, and there isn’t a whole lot one can do about it.

In order to move beyond the UK, I had to buy a new domain, this time a generic, non-country-specific one (“oxbridgenotes.com”). From this base, I could add country-specific subdomains or folders as I wished, and Google would respect this structure in their rankings.
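For illustration, here is a minimal Python sketch of how an application on a generic domain might work out the target region from either a country subdomain or a leading folder. The set of supported regions, the default, and the function name are assumptions made for this example; they do not describe the actual Oxbridge Notes routing.

```python
# Hypothetical configuration for this sketch.
SUPPORTED_REGIONS = {"fr", "de", "us"}
DEFAULT_REGION = "uk"  # assumption: requests without a region marker get the UK site


def region_from_request(host: str, path: str) -> str:
    """Resolve the region from 'fr.example.com' or 'example.com/fr/...'."""
    first_label = host.lower().split(".")[0]
    if first_label in SUPPORTED_REGIONS:
        return first_label

    segments = [segment for segment in path.lower().split("/") if segment]
    if segments and segments[0] in SUPPORTED_REGIONS:
        return segments[0]

    return DEFAULT_REGION


if __name__ == "__main__":
    print(region_from_request("fr.oxbridgenotes.com", "/"))
    print(region_from_request("oxbridgenotes.com", "/de/law"))
    print(region_from_request("oxbridgenotes.com", "/law"))
```

Whichever convention is chosen (subdomains or folders), resolving the region in one function keeps the rest of the application indifferent to the URL scheme.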

But there was a big downside to changing domains so late in my product cycle. My .co.uk domain had been live for years and attracted good organic traffic. Naturally, I didn’t want to shut it down. In the end, I decided to maintain both the old domain (“oxbridgenotes.co.uk”) and the new one (“oxbridgenotes.com”), running the pair on the one server, configured to grudgingly gel together via an annoyingly complicated and bug-prone routing setup.

Aside from programming complication, this splitting of my website into two domains was far from ideal for another reason: I ended up spreading my “SEO juice” across two different domain names—the old .co.uk one, and the .com which was virgin and therefore weak in search rankings. A new product released on the .com wouldn’t rank at all, whereas the same thing on the .co.uk would do very well. Had I started off with a single domain name from the very start, then I wouldn’t have had this two-tier SEO and brand problem; everything released onto the website would have ranked respectably.

(I am aware that there are facilities to officially move to a new domain through Google Search Console, but I was and still am too risk-averse to toy with this particularly treacherous type of fire.)

In conclusion, I would advise my former self to be a little more optimistic about eventual international expansion and start the website on a top-level domain that isn’t tied to a particular country.

Avoid Keyword Cannibalisation

Google views the web as individual pages, not as entire domains. So, contrary to popular myth, there is no such thing as ranking your domain for a keyword. In fact, what’s really ranking there is your homepage, which often gets the most internal and external links.

Instead, you ought to think of your website as a collection of individual pages that compete with each other and every other web page out there to rank for some particular keyword or keyphrase. This competition between pages on your own domain leads to a problem known as “keyword cannibalisation”.

Imagine a webmaster slapping the keyword “wordpress templates” on 200 different product pages of their digital downloads website. Then no single product page is obviously pre-eminent to a third party wishing to direct their customers to this source of wordpress templates. As such, whenever a third party links to our webmaster, the link could point to any one of these 200 pages instead of concentrating on the most relevant one. The end result is that none of the pages on the digital downloads domain reach the strength needed to appear on the first page of Google results.

Furthermore, this sort of “tactic” is damaging to the digital download seller’s overall conversion rates. Some product pages might be better at converting visitors than others, yet the better and the average pages show up side-by-side in Google search results. As such, a random visitor is as likely to land on a page that’s poor at converting as a page that’s a conversion winner.

The webmaster would be better advised to choose one unique compound keyword for each page, such as “wordpress templates for pubs”, “bootstrap wordpress templates”, or “easy-to-use wordpress templates”, and focus all their on-page SEO signals—the URL, the title tag, etc.—on this narrower goal.

What about the original catch-all parent keyword, “wordpress templates”? The webmaster should invest in creating a stellar web page for this parent keyword, something that people find very valuable and therefore link-worthy. Then, they should ensure that every page focused on a narrower compound keyword links back to this parent page, thereby passing on some SEO juice via internal linking.

Now our webmaster has 199 internal links to their “wordpress templates” page, and an obvious catch-all place for third parties to link to—a useful result supporting general and specific SEO!
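A simple audit can catch cannibalisation before it ships: list the target keyword of every page and flag any keyword claimed by more than one URL. The Python sketch below assumes the site can export a mapping of URL to target keyword; the function name and the sample URLs are hypothetical.

```python
from collections import defaultdict


def cannibalised_keywords(target_keyword_by_url: dict[str, str]) -> dict[str, list[str]]:
    """Return each keyword targeted by more than one page, with the competing URLs."""
    urls_by_keyword = defaultdict(list)
    for url, keyword in target_keyword_by_url.items():
        # Normalise so that "WordPress Templates" and "wordpress templates" collide.
        urls_by_keyword[keyword.strip().lower()].append(url)
    return {
        keyword: sorted(urls)
        for keyword, urls in urls_by_keyword.items()
        if len(urls) > 1
    }


if __name__ == "__main__":
    pages = {
        "/templates/pub": "wordpress templates",
        "/templates/bootstrap": "WordPress Templates",
        "/templates/easy": "easy-to-use wordpress templates",
    }
    print(cannibalised_keywords(pages))
```

An empty result means every page has its own keyword; any entry in the result is a set of pages competing with one another.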

Concoct a Scalable Content Creation Strategy

Roughly speaking, the more unique and useful content your website has, the better you’ll do. A full-fledged SEO campaign therefore includes an intelligent and cost-effective way to generate this precious basic ingredient of ranking. This idea is eloquently captured in a buzzword coined by Patrick McKenzie: scalable content creation.^1-2-2

Your specific choice of scalable content creation strategy will affect the eventual form your business takes, so it’s important for you to choose your course before writing your first line of code. The key piece of homework is answering the question, “How do I create large amounts of content valuable to customers at a low cost of time, effort, and/or money?”

Notice my emphasis on “valuable to customers”—too many websites forget this and pump out any old drivel for content. This is unwise though, because Google can detect dissatisfied visitors by the fact that they press the Back button, by their time spent on your page, and/or by their repeating their previous search. If this happens often, Google interprets this as a strong signal that your web page sucks, and they respond by banishing you to the lowest ranks of the search ladder.

One option to create content is to write it yourself. This can be a good idea to start with, but it doesn’t scale. Were you to spend your entire working day writing, you might only be able to produce at a rate of five pages per day, while not investing time in the rest of your business. Not ideal. So what are your other options?

i. Incentivise your users into generating public content

Whereas you are but one person, your users may number in the thousands or even in the millions. By giving them some advantage or value (which may be as simple as a place to express themselves and be heard), you can incentivise them to create content on your platform—content that will be picked up by search engines and employed to attract even more users, who will themselves produce content, continuing and amplifying your reach ad infinitum.

There are all sorts of user-generated content creation systems out there in the wild, and I’d like to share some of my favourites:

1. GitHub.com offers a free code-hosting service for anyone with open-source code projects. The public activity of these legions of programmers leaves behind millions of pages of content, composed of code project readmes (which GitHub automatically converts into HTML), issue-tracking discussions, code review comments, and so on. GitHub gets free content, and the programmers get a free place to host their open-source code while showing off their coding chops to their peers. Everyone wins.

2. People love giving opinions, and by playing to this innate human drive, websites like Amazon generate an unending stream of content by asking anyone (not just their customers) to review the products they sell. And considering that Amazon sells pretty much everything, this amounts to a whole boatload of review content. Another great source of user-generated content is the comment section underneath online newspaper articles. Sensitive topics (like Palestine or migration) often compel thousands of readers to contribute their opinions. To see this content-creation strategy done at its finest, check out the Guardian’s website. Sometimes every word of official editorial content yields over a hundred words in public commentary.

3. In my primary business, Oxbridge Notes, the average author we publish on our platform uploads about 100 pages of notes. These well-organized packages of pages constitute a rich lode of user-generated content that has never been on the internet before. The customers who buy these packages like to view free samples before buying, so we wrote an algorithm that turns the first 20% of each uploaded package into such a sample. This gives us a yield of about 20 pages of unique content per author. To capitalise on this content, we create landing pages for each of these free samples, complete with a PDF and an HTML transcription of the text. This strategy is completely automated, yet accounts for 94% of our organic traffic. The average time spent on each of these generated pages is 1.5 minutes, showing that customers find the text there to be valuable.

4. YouTube and SoundCloud tap into people’s desire to express their creativity and share the results with the world. By providing platforms for hosting video and music, both websites have drawn huge, global audiences. And if that weren’t already enough, these platforms also offer comment sections where admirers and haters duke it out regularly for days and sometimes even years, leaving behind even more indexable content.

5. Stack Overflow, a popular question and answer forum for programmers, envelops both its questioners and its question-answerers with addictive gamification, consisting of reputation points, badges, and specially unlocked website privileges. But that’s not all, not by a long shot. Stack Overflow rewards its question-answerers emotionally through granting them “proof” that they are right/helpful, via the mechanism of a questioner “accepting” one answer above all other answers from other contributors. To top it all off, Stack Overflow has convinced many people that these reputation points count towards an employer’s hiring decision, enabling Stack Overflow to turbocharge its content generation with the fuel of career ambition (while boosting its business model: job ads). Indeed, some companies employing programmers have started weighing Stack Overflow reputation points in determining a prospective employee’s ability, so this approach has become a self-fulfilling prophecy.

6. Objc.io, an online magazine for iPhone programmers, started by building a slick-looking web platform and writing their first issue with in-house authors. They then polished their content with the help of a professional editor and the addition of a pretty magazine cover. This distinguished them from your average blog, and they were able to leverage this edge to convince thought leaders to contribute articles to their subsequent magazine issues. In exchange for their initial efforts with that first issue, they managed to convince brilliant people to contribute world-class content to the subsequent 23 issues.
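To make the free-sample step from the Oxbridge Notes example above concrete, here is a minimal Python sketch of taking the first 20% of an uploaded package as the public sample. It is an illustration under assumptions, not the production algorithm: the function name is invented, pages are represented as plain strings, and the rule of always exposing at least one page is my own addition.

```python
def sample_pages(pages: list[str], sample_percent: int = 20) -> list[str]:
    """Return the leading pages of a package to publish as its free sample."""
    if not pages:
        return []
    # Integer arithmetic avoids floating-point surprises; very short packages
    # still expose one page (an assumption made for this sketch).
    count = max(1, len(pages) * sample_percent // 100)
    return pages[:count]


if __name__ == "__main__":
    package = [f"page {number}" for number in range(1, 101)]
    sample = sample_pages(package)
    print(len(sample), sample[0], sample[-1])
```

Each sampled page would then be rendered into its own landing page, which is where the indexable content comes from.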

ii. Automatically generate content from algorithms

Automatically generating content (e.g., by combining database information, scraping, and clever algorithms) often results only in reprehensible, unreadable spam-level output. That said, in rare cases a clever webmaster has the skills and taste to pull off algorithmic content generation, and they may be rewarded magnificently for their efforts.

One such example is Versus.com, which generates English language sentences from data points about comparable electronic devices (e.g., phones or video game consoles). These data points, presumably gathered from manufacturer websites or massive retailers, are about things like the relative battery life of devices, their processing power, their screen size, or their backwards compatibility. Relying on this strategy of turning API-consumable data points into HTML sentences useful to humans, they have acquired monthly traffic in the millions. And their content, despite having been generated by an algorithm, is sufficiently compelling to ensure long and often heated discussions by real humans in the comment section appended to each page.
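The general technique of turning data points into readable sentences can be sketched with simple templates. The Python below is a toy illustration and not how Versus.com actually works: the device names, spec keys, and figures are made up, and it assumes that a higher number is always better for every listed spec.

```python
def comparison_sentences(name_a, specs_a, name_b, specs_b, labels):
    """Turn two devices' numeric specs into one English sentence per shared spec.

    `labels` maps a spec key to a (human-readable label, unit) pair.
    """
    sentences = []
    for key, (label, unit) in labels.items():
        if key not in specs_a or key not in specs_b:
            continue  # skip specs that only one device reports
        a, b = specs_a[key], specs_b[key]
        if a == b:
            sentences.append(
                f"{name_a} and {name_b} have the same {label} ({a} {unit})."
            )
        else:
            winner, loser = (name_a, name_b) if a > b else (name_b, name_a)
            sentences.append(
                f"{winner} has {abs(a - b)} {unit} more {label} than {loser} "
                f"({max(a, b)} vs {min(a, b)} {unit})."
            )
    return sentences


if __name__ == "__main__":
    labels = {
        "battery_mah": ("battery capacity", "mAh"),
        "screen_in": ("screen size", "inches"),
    }
    phone_a = {"battery_mah": 4000, "screen_in": 6}  # made-up figures
    phone_b = {"battery_mah": 3500, "screen_in": 6}
    for sentence in comparison_sentences("Phone A", phone_a, "Phone B", phone_b, labels):
        print(sentence)
```

The quality bar mentioned above still applies: templates like these only earn their keep if the resulting page answers a question a real visitor has.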

iii. Hire content generators

Patrick McKenzie hired a school teacher to write bingo cards on given topics for his website Bingo Card Creator. This school teacher produced 30 sets of cards every month through a custom CMS that Patrick built into his web application. Once these bingo cards were uploaded, the Bingo Card Creator website algorithmically turned them into search-engine optimised web pages.

The key to this strategy was the transformation of content creation into a repeatable task that could be performed by a non-technical collaborator and repeated thousands of times. After a while, this built up a formidable bank of content, all without Patrick having to contribute any more of his own time.

A similar trend in the ecommerce world involves hooking freelance writers into a website’s CMS so as to write unique product descriptions, say for the cameras or the phones being sold. The motivation here is that ecommerce websites relying only on the manufacturer’s product description often have product pages with insufficient word counts for Google to bother indexing them. And even if Google did index these pages, there would probably be insufficient unique content to outrank the gazillion other retailers selling the exact same product with the exact same description texts.

