Recently, I’ve been diving into organic content. As I’ve learned more about developing and promoting content, link building has been the hardest part. It’s often avoided because it is so difficult, but to build your domain authority, you need high-value websites to link to yours.
There are many ways to acquire links from these blog posts, but the first step is a lot like sales. You need to develop a list of the targets that are most valuable to your blog.
I found some fantastic research in Optimize Smart’s link building strategy, and I recommend it to everyone interested in link building. In it, they give 200 valuable queries that you can use to find the top domains in your industry or for the keywords you care about.
Sadly, who has the time or determination to search Google 200 times and collect that information? So I wrote a Python script to do it for me.
To run this script, you need three libraries and one helper function. The pprint library pretty-prints your search results. The requests library makes the HTTP request to Google’s Custom Search API.
The pandas library stores our information in a CSV, and the json_normalize function from pandas quickly turns the JSON that Google returns into a table we can save as a CSV.
import pprint
import requests
from pandas import json_normalize  # pandas 1.0+; the older pandas.io.json import path is deprecated
import pandas as pd
I wrote this program so that any company can edit the keyword, industry name, competitor name, and other values they care about to easily find the relevant links. Simply replace the placeholders with the values that make sense for your company and the queries will work.
industryName = 'YOUR_INDUSTRY_NAME'
competitorName = 'YOUR_COMPETITOR'
keyword = 'YOUR_keyword'
industryVertical = 'YOUR_VERTICAL_NAME'
city_name = 'YOUR_CITY_NAME'
ProductName = 'YOUR_PRODUCT_NAME'  # capitalized to match the name used in the queries below
The most important function in this Python script is our search function. Wrapping the search in a function keeps the for loop that calls it short and easy to change.
You first need to include your Google Search Key and Google Search Engine. You can find the tutorial on how to create a Google search engine in my article about how to search Google automatically.
We will then take that query and request results in batches until we reach the limit we set.
Each batch goes out as an HTTP request. If the response contains any items, we normalize them into a DataFrame using the json_normalize function.
We then concatenate those results together and return them from the function.
def search_link_building_query(query):
    search_key = 'YOUR_GOOGLE_SEARCH_KEY_HERE'
    search_cx = 'YOUR_GOOGLE_SEARCH_ENGINE_HERE'
    search_url = 'https://www.googleapis.com/customsearch/v1'
    batch = 10  # the API accepts at most 10 results per request
    limit = 10  # total number of results to collect for this query
    start = 1   # the API's 'start' index is 1-based
    items = pd.DataFrame()
    while start <= limit:
        params = {'key': search_key, 'cx': search_cx, 'q': query, 'num': batch, 'start': start}
        response = requests.get(url=search_url, params=params)
        results = response.json()
        # Error responses and empty result pages have no 'items' key
        if 'items' in results:
            new_items = json_normalize(results['items'])
            items = pd.concat([items, new_items], sort=False, ignore_index=True)
            print(new_items)
        start += batch
    return items
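If you have not used json_normalize before, here is a minimal sketch of what it does to a response. The payload below is made up for illustration and only mimics the shape of a search response; it uses pd.json_normalize, which is available in pandas 1.0 and later, and the helper name items_to_frame is my own.

```python
import pandas as pd

# Made-up payload shaped like a search response (all values are illustrative).
sample_response = {
    "items": [
        {
            "title": "Write for us",
            "link": "https://example.com/write-for-us",
            "displayLink": "example.com",
            "pagemap": {"metatags": [{"og:type": "article"}]},
        },
        {
            "title": "Guest post guidelines",
            "link": "https://blog.example.org/guest",
            "displayLink": "blog.example.org",
        },
    ]
}


def items_to_frame(payload):
    """Return one row per search result, or an empty frame if there are none."""
    if "items" not in payload:
        return pd.DataFrame()
    # Nested dictionaries become dotted column names such as 'pagemap.metatags'.
    return pd.json_normalize(payload["items"])


frame = items_to_frame(sample_response)
print(frame[["title", "displayLink"]])
```

Results that lack a nested field simply get an empty value in that column, which is why the concatenated DataFrame stays rectangular even when Google returns different fields per result.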
This is the meat of the program. We need to transfer all of the search queries from the article into our script. To keep it usable for anyone, variables appear without quotation marks, while every literal part of a query is wrapped in quotes with a leading space so the pieces concatenate cleanly.
We also want to make an array of every single query so that we can easily iterate through with a for loop.
The end result will look like the code underneath.
# 'state-code' in the site: filters is a placeholder; replace it with a real state code.
search_1 = industryName + " intitle:interview -job"
search_2 = industryVertical + " intitle:interview -job"
search_3 = competitorName + " intitle:interview -job"
search_4 = industryName + " inurl:interview -job"
search_5 = competitorName + " inurl:interview -job"
search_6 = keyword + " intitle:experts interview/talk/discuss/answer"
search_7 = keyword + " site:wordpress.com"
search_8 = keyword + " site:blogspot.com"
search_9 = keyword + " site:typepad.com"
search_10 = keyword + " site:edublogs.org"
search_11 = keyword + " site:livejournal.com"
search_12 = keyword + " intext:powered by wordpress"
search_13 = keyword + " intext:powered by typepad"
search_14 = keyword + " guest blog" + " inanchor:contact"
search_15 = keyword + " guest blogger" + " inanchor:contact"
search_16 = keyword + " guest Column" + " inanchor:contact"
search_17 = keyword + " guest article" + " inanchor:contact"
search_18 = keyword + " write for us" + " inanchor:contact"
search_19 = keyword + " write for me" + " inanchor:contact"
search_20 = keyword + " become a contributor" + " inanchor:contact"
search_21 = keyword + " contribute to this site" + " inanchor:contact"
search_22 = keyword + " write for us"
search_23 = keyword + " write for me"
search_24 = keyword + " become a contributor"
search_25 = keyword + " contribute to this site"
search_26 = keyword + " inurl:category/guest"
search_27 = keyword + " inurl:contributors"
search_28 = keyword + " guest blog"
search_29 = keyword + " guest blogger"
search_30 = keyword + " guest Column"
search_31 = keyword + " guest article"
search_32 = keyword + " guest post"
search_33 = keyword + " guest author"
search_34 = keyword + " site:.gov"
search_35 = keyword + " site:.gov.uk"
search_36 = keyword + " site:.gov.in"
search_37 = keyword + " site:.nic.in"
search_38 = keyword + " site:.gov.au"
search_39 = " Research Council" + " site:.state-code.us"
search_40 = " Research Council" + " site:.com"
search_41 = " Research Council" + " site:.uk"
search_42 = " Research Council" + " site:.in"
search_43 = " Research Council" + " site:.ca"
search_44 = " Research Council" + " site:.state-code.au"
search_45 = " Research Council" + " site:.com"
search_46 = city_name + " site:.lib.state-code.us"
search_47 = city_name + " library" + " site:.state-code.us"
search_48 = city_name + " library" + " site:.edu"
search_49 = city_name + " library" + " site:.uk"
search_50 = city_name + " library" + " site:.edu"
search_51 = city_name + " library" + " site:.in"
search_52 = city_name + " library" + " site:.ca"
search_53 = city_name + " library" + " site:.state-code.au"
search_54 = city_name + " library" + " site:.edu"
search_55 = city_name + " site:.cc.state-code.us"
search_56 = city_name + " site:.tec.state-code.us"
search_57 = city_name + " College/university" + " site:.edu"
search_58 = city_name + " site:.ac.uk"
search_59 = city_name + " site:.ac.in"
search_60 = city_name + " College/university" + " site:.edu.in"
search_61 = city_name + " site:.res.in"
search_62 = city_name + " College/university" + " site:.ca"
search_63 = city_name + " College/university" + " site:.com"
search_64 = city_name + " College/university" + " site:.edu.state-code.au"
search_65 = city_name + " College/university" + " site:.ac.nz"
search_66 = city_name + " Site:.k12.state-code.us"
search_67 = city_name + " Site:.pvt.k12.state-code.us"
search_68 = city_name + " site:.kid"
search_69 = " Site:." + city_name + ".sch.uk"  # the city name is part of the domain here
search_70 = city_name + " School"
search_71 = city_name + " School" + " site:.state-code.au"
search_72 = city_name + " site:.school.nz"
search_73 = city_name + " Chamber of commerce site:.state-code.us"
search_74 = " Chamber of commerce site:.state-code.us"
search_75 = city_name + " Chamber of commerce site:.org"
search_76 = city_name + " Chamber of commerce site:.com"
search_77 = city_name + " Chamber of Commerce site:.uk"
search_78 = city_name + " Chamber of Commerce site:.com"
search_79 = city_name + " Chamber of Commerce site:.in"
search_80 = city_name + " Chamber of Commerce site:.ca"
search_81 = city_name + " Chamber of Commerce site:.au"
search_82 = city_name + " Chamber of Commerce site:.com"
search_83 = industryName + " site:meetup.com"
search_84 = industryName + " intitle:conference"
search_85 = industryName + " intitle:seminar"
search_86 = industryName + " intitle:expo"
search_87 = industryName + " intitle:trade show"
search_88 = industryName + " intitle:exhibition"
search_89 = industryName + " site:facebook.com"
search_90 = industryName + " site:twitter.com"
search_91 = industryName + " site:linkedin.com"
search_92 = keyword + " intext:this is a sponsored post"
search_93 = keyword + " intext:this was a sponsored post"
search_94 = keyword + " intext:this is a paid post"
search_95 = keyword + " intext:this was a paid post"
search_96 = keyword + " intext:this is a Sponsored review"
search_97 = keyword + " intext:this was a Sponsored review"
search_98 = keyword + " intext:this is a paid review"
search_99 = keyword + " intext:this was a paid review"
search_100 = industryName + " site:quora.com"
search_101 = industryName + " site:askville.amazon.com"
search_102 = industryName + " site:linkedin.com/answers"
search_103 = industryName + " site:wiki.answers.com"
search_104 = industryName + " site:answers.yahoo.com"
search_105 = ProductName + " intitle:review"
search_106 = ProductName + " intitle:ratings"
search_107 = ProductName + " intitle:comparison"
search_108 = ProductName + " intitle:price comparison"
search_109 = ProductName + " intitle:compare"
search_110 = ProductName + " intitle:recommended"
search_111 = ProductName + " review" + " site:livejournal.com"
search_112 = ProductName + " ratings" + " site:livejournal.com"
search_113 = ProductName + " comparison" + " site:livejournal.com"
search_114 = ProductName + " price comparison" + " site:livejournal.com"
search_115 = ProductName + " review" + " site:typepad.com"
search_116 = ProductName + " ratings" + " site:typepad.com"
search_117 = ProductName + " comparison" + " site:typepad.com"
search_118 = ProductName + " price comparison" + " site:typepad.com"
search_119 = ProductName + " recommended intext:powered by typepad"
search_120 = ProductName + " review" + " site:blogspot.com"
search_121 = ProductName + " ratings" + " site:blogspot.com"
search_122 = ProductName + " comparison" + " site:blogspot.com"
search_123 = ProductName + " price comparison" + " site:blogspot.com"
search_124 = ProductName + " review" + " site:wordpress.com"
search_125 = ProductName + " ratings" + " site:wordpress.com"
search_126 = ProductName + " comparison" + " site:wordpress.com"
search_127 = ProductName + " price comparison" + " site:wordpress.com"
search_128 = ProductName + " compare" + " intext:powered by wordpress"
search_129 = industryName + " top/recommended/useful/favorite/amazing/awesome tools"
search_130 = industryName + " top/recommended/useful/favorite/amazing/awesome badges"
search_131 = industryName + " top/recommended/useful/favorite/amazing/awesome widgets"
search_132 = industryName + " top/recommended/useful/favorite/amazing/awesome infographics"
search_133 = industryName + " inurl:tools"
search_134 = industryName + " inurl:badges"
search_135 = industryName + " inurl:widgets"
search_136 = industryName + " inurl:infographics"
search_137 = industryName + " intitle:tools"
search_138 = industryName + " intitle:badges"
search_139 = industryName + " intitle:widgets"
search_140 = industryName + " intitle:infographics"
search_141 = keyword + " site:.org"
search_142 = keyword + " site:.org.uk"
search_143 = keyword + " site:.org.in"
search_144 = keyword + " site:.org.au"
# search_145 to search_151 now use the keyword variable instead of the literal text "keyword"
search_145 = keyword + " sweeps*" + " intitle:submit"
search_146 = keyword + " giveaways" + " intitle:submit"
search_147 = keyword + " coupons" + " intitle:list"
search_148 = keyword + " coupons" + " intitle:submit/add"
search_149 = keyword + " deals" + " intitle:submit/add"
search_150 = " deals for " + keyword + " *" + " intitle:submit/add"
search_151 = " coupons for " + keyword + " *" + " intitle:submit/add"
# "Industrykeyword" is a literal placeholder carried over from the source list; replace it with your own term
search_152 = " Industrykeyword" + " sign up/join/register/create an account"
search_153 = keyword + " directory"
search_154 = keyword + " directory" + " add/submit/suggest/post"
search_155 = keyword + " intitle:directory"
search_156 = keyword + " inurl:directory"
search_157 = keyword + " Listings"
search_158 = keyword + " add your business/list your business"
search_159 = keyword + " intitle:add/submit/suggest/post/list/recommend article"
search_160 = keyword + " intitle:add/submit/suggest/post/list/recommend video"
search_161 = keyword + " intitle:add/submit/suggest/post/list/recommend podcast"
search_162 = keyword + " intitle:add/submit/suggest/post/list/recommend whitepaper"
search_163 = keyword + " intitle:add/submit/suggest/post/list/recommend webinars"
search_164 = keyword + " intitle:add/submit/suggest/post/list/recommend event"
search_165 = keyword + " intitle:add/submit/suggest/post/list job"
search_166 = keyword + " intitle:add/submit/suggest/post/list contest"
search_167 = keyword + " intitle:add/submit/post/list coupons"
search_168 = keyword + " add/submit/suggest/post/list/recommend article"
search_169 = keyword + " add/submit/suggest/post/list/recommend video"
search_170 = keyword + " add/submit/suggest/post/list/recommend podcast"
search_171 = keyword + " add/submit/suggest/post/list/recommend whitepaper"
search_172 = keyword + " add/submit/suggest/post/list/recommend webinars"
search_173 = keyword + " add/submit/suggest/post/list/recommend event"
search_174 = keyword + " add/submit/suggest/post/list job"
search_175 = keyword + " add/submit/suggest/post/list contest"
search_176 = keyword + " add/submit/post/list coupons"
search_177 = keyword + " add a site/submit site/suggest site/post site/recommend site"
search_178 = keyword + " add URL/submit URL/suggest URL/post URL/recommend URL"
search_179 = keyword + " add listing/submit listing/suggest listing/post listing/recommend listing"
search_180 = keyword + " inurl:links/resources"
search_181 = keyword + " whitepapers"
search_182 = keyword + " videos"
search_183 = keyword + " podcasts"
search_184 = keyword + " research"
search_185 = keyword + " site:.edu"
search_186 = keyword + " site:.info"
search_187 = keyword + " filetype:doc/docx/xls/ppt/pdf"
search_188 = keyword + " news/industry news"
search_189 = keyword + " magazine/industry magazine"
search_190 = keyword + " journal/industry journal"
search_191 = " list of " + keyword + " sites"
search_192 = keyword + " intitle:resources"
search_193 = keyword + " round up"
search_194 = keyword + " intitle:round up"
search_195 = keyword + " round up" + " intitle:weekly/daily/monthly"
search_196 = keyword + " intitle:list"
search_197 = keyword + " guide"
search_198 = keyword + " recommended links/suggested links"
search_199 = keyword + " useful links/interesting links"
search_200 = keyword + " favorite links"
search_201 = keyword + " recommended tools/suggested tools"
search_202 = keyword + " useful tools/interesting tools"
search_203 = keyword + " favorite tools"
search_204 = keyword + " recommended articles/suggested articles"
search_205 = keyword + " useful articles/interesting articles"
search_206 = keyword + " favorite articles"
search_207 = keyword + " recommended resources/suggested resources"
search_208 = keyword + " useful resources/interesting resources"
search_209 = keyword + " favorite resources"
search_210 = keyword + " recommended sites/suggested sites"
search_211 = keyword + " useful sites/interesting sites"
search_212 = keyword + " favorite sites"
search_213 = keyword + " recommended websites/suggested websites"
search_214 = keyword + " useful websites/interesting websites"
search_215 = keyword + " favorite websites"
search_216 = keyword + " top 10 resources/top resources"
search_217 = keyword + " top 10 sites/top sites"
search_218 = keyword + " top 10 websites/top websites"
search_219 = keyword + " top 10 articles/ top articles"
search_220 = keyword + " top 10 tools/top tools"
search_221 = keyword + " top 10 web resources/top web resources"
search_222 = keyword + " top 10 internet resources/top internet resources"
search_223 = keyword + " top 10 online resources/top online resources"
# search_224 to search_243 repeat search_14 to search_33, as in the generated script
search_224 = keyword + " guest blog" + " inanchor:contact"
search_225 = keyword + " guest blogger" + " inanchor:contact"
search_226 = keyword + " guest Column" + " inanchor:contact"
search_227 = keyword + " guest article" + " inanchor:contact"
search_228 = keyword + " write for us" + " inanchor:contact"
search_229 = keyword + " write for me" + " inanchor:contact"
search_230 = keyword + " become a contributor" + " inanchor:contact"
search_231 = keyword + " contribute to this site" + " inanchor:contact"
search_232 = keyword + " write for us"
search_233 = keyword + " write for me"
search_234 = keyword + " become a contributor"
search_235 = keyword + " contribute to this site"
search_236 = keyword + " inurl:category/guest"
search_237 = keyword + " inurl:contributors"
search_238 = keyword + " guest blog"
search_239 = keyword + " guest blogger"
search_240 = keyword + " guest Column"
search_241 = keyword + " guest article"
search_242 = keyword + " guest post"
search_243 = keyword + " guest author"
search_array = [
    search_1, search_2, search_3, search_4, search_5, search_6, search_7, search_8, search_9, search_10,
    search_11, search_12, search_13, search_14, search_15, search_16, search_17, search_18, search_19, search_20,
    search_21, search_22, search_23, search_24, search_25, search_26, search_27, search_28, search_29, search_30,
    search_31, search_32, search_33, search_34, search_35, search_36, search_37, search_38, search_39, search_40,
    search_41, search_42, search_43, search_44, search_45, search_46, search_47, search_48, search_49, search_50,
    search_51, search_52, search_53, search_54, search_55, search_56, search_57, search_58, search_59, search_60,
    search_61, search_62, search_63, search_64, search_65, search_66, search_67, search_68, search_69, search_70,
    search_71, search_72, search_73, search_74, search_75, search_76, search_77, search_78, search_79, search_80,
    search_81, search_82, search_83, search_84, search_85, search_86, search_87, search_88, search_89, search_90,
    search_91, search_92, search_93, search_94, search_95, search_96, search_97, search_98, search_99, search_100,
    search_101, search_102, search_103, search_104, search_105, search_106, search_107, search_108, search_109, search_110,
    search_111, search_112, search_113, search_114, search_115, search_116, search_117, search_118, search_119, search_120,
    search_121, search_122, search_123, search_124, search_125, search_126, search_127, search_128, search_129, search_130,
    search_131, search_132, search_133, search_134, search_135, search_136, search_137, search_138, search_139, search_140,
    search_141, search_142, search_143, search_144, search_145, search_146, search_147, search_148, search_149, search_150,
    search_151, search_152, search_153, search_154, search_155, search_156, search_157, search_158, search_159, search_160,
    search_161, search_162, search_163, search_164, search_165, search_166, search_167, search_168, search_169, search_170,
    search_171, search_172, search_173, search_174, search_175, search_176, search_177, search_178, search_179, search_180,
    search_181, search_182, search_183, search_184, search_185, search_186, search_187, search_188, search_189, search_190,
    search_191, search_192, search_193, search_194, search_195, search_196, search_197, search_198, search_199, search_200,
    search_201, search_202, search_203, search_204, search_205, search_206, search_207, search_208, search_209, search_210,
    search_211, search_212, search_213, search_214, search_215, search_216, search_217, search_218, search_219, search_220,
    search_221, search_222, search_223, search_224, search_225, search_226, search_227, search_228, search_229, search_230,
    search_231, search_232, search_233, search_234, search_235, search_236, search_237, search_238, search_239, search_240,
    search_241, search_242, search_243,
]
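The generated list repeats some queries (search_224 through search_243 mirror search_14 through search_33), and every query you send counts as a search. A small sketch of one way to drop the repeats while keeping the original order is below. The function name unique_queries and the sample strings are my own for illustration; in the script you would pass search_array instead.

```python
def unique_queries(queries):
    """Drop repeated queries, keeping each one at its first position."""
    # dict keys are unique and preserve insertion order in Python 3.7+
    return list(dict.fromkeys(queries))


# Stand-in list for illustration; in the script this would be search_array.
example_queries = [
    "landing page builder guest post",
    "landing page builder write for us",
    "landing page builder guest post",
]
deduplicated = unique_queries(example_queries)
print(len(example_queries), "->", len(deduplicated))
```

I prefer this over converting to a set because the order of the queries stays the same, which makes the output easier to compare against the original list.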
I wrote a quick script that turned the blog post into the script above so that we could use it in our code while also using the variables we care about.
In my Python script, I pulled the text from the blog post, scraped it, and then pulled only the paragraphs that included a plus sign. I then wrote the Python script above that declared all my variables and put them in an array so that we could iterate through them.
import requests
import bs4
import pandas as pd

url = 'https://www.optimizesmart.com/10000-search-engine-queries-for-your-link-building-campaign/'
response = requests.get(url).text
print(response)
searches = pd.DataFrame()
soup = bs4.BeautifulSoup(response, 'html.parser')
paragraphs = soup.find_all('p')

# Keep only the paragraphs that look like search queries
for paragraph in paragraphs:
    new_row = pd.DataFrame()
    text = paragraph.text
    if text.find('+') != -1 or text.find('site:') != -1:
        new_row['Search'] = pd.Series(paragraph.text)
        print(paragraph.text)
        frames = [new_row, searches]
        searches = pd.concat(frames, sort=False, ignore_index=True)

# Split "Keyword/yourCompetitorName" queries into one query per variable
for index, row in searches.iterrows():
    search = row['Search']
    if search.find('/yourCompetitorName') != -1:
        new_row = pd.DataFrame()
        search = search.replace('Keyword/', '')  # str.replace returns a new string, so reassign it
        new_row['Search'] = pd.Series(search)
        frames = [searches, new_row]
        searches = pd.concat(frames, sort=False, ignore_index=True)
searches['Search'] = searches['Search'].str.replace('/yourCompetitorName', '')
pd.options.display.width = 0
print(searches)

# Write the variable declarations and the queries to a new Python file
f = open("google_search.py", "a")
iterator = 1
variable_list = ['city_name', 'Keyword', 'industryName', 'competitorName', 'cityName', 'ProductName', 'yourCompetitorName']
array_encoding_statement = 'search_array = ['
f.write("Keyword = 'landing page builder'\n")
f.write("industryName = 'MarTech'\n")
f.write("competitorName = 'LandingI'\n")
f.write("yourCompetitorName = 'LandingI'\n")
f.write("keyword = 'landing page builder'\n")
f.write("industryVertical = 'MarTech'\n")
f.write("city_name = 'Phoenix'\n")
f.write("ProductName = 'landing page builder'\n")
for index, row in searches.iterrows():
    search = row['Search']
    print(search)
    # Strip quotes and angle brackets so the text can be wrapped in our own quotes
    search = search.replace('"', '')
    search = search.replace("'", "")
    search = search.replace('“', '')
    search = search.replace('”', '')
    search = search.replace('<', '')
    search = search.replace('>', '')
    search = search.replace('city-name', 'city_name')
    search_split = search.split('+')
    search = ''
    for split in search_split:
        if split.strip() in variable_list:
            # Variables are written bare
            search += split.strip() + '+'
        else:
            # Literal text is quoted with a leading space
            search += '" ' + split.strip() + '"' + '+'
    search = search[:-1]
    f.write("search_" + str(iterator) + " = " + search + '\n')
    array_encoding_statement += "search_" + str(iterator) + ','
    iterator += 1
array_encoding_statement = array_encoding_statement[:-1]
array_encoding_statement += ']'
f.write(array_encoding_statement + '\n')
f.write('for search in search_array:\n')
f.write('\tprint(search)\n')
f.close()
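The quoting step is the part of that script that is easiest to get wrong, so here is a minimal standalone sketch of it. The function name to_expression and the VARIABLE_NAMES set are hypothetical names I am using for illustration; the logic is the same idea as above: split on the plus sign, leave known variables bare, and quote everything else with a leading space.

```python
# Hypothetical set of variable names; adjust it to the variables your script declares.
VARIABLE_NAMES = {"keyword", "industryName", "competitorName", "city_name", "ProductName"}


def to_expression(query_template, variable_names=VARIABLE_NAMES):
    """Turn 'keyword + guest post' into the Python expression: keyword + " guest post"."""
    parts = []
    for part in query_template.split("+"):
        token = part.strip()
        if token in variable_names:
            # Known variables are written bare so Python substitutes their values
            parts.append(token)
        else:
            # Literal text is quoted, with a leading space to separate it from its neighbor
            parts.append('" ' + token + '"')
    return " + ".join(parts)


print(to_expression("keyword + guest post + inanchor:contact"))
# prints: keyword + " guest post" + " inanchor:contact"
```

Keeping this in its own function makes it easy to try a few templates by hand before writing anything to a file.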
Now that you have the list of search queries, you can either run one specific query and export it as a test, or pull all of the data in one go with a for loop.
A single search can be pulled with the code below.
search_link_building_query(search_1).to_csv('link-building-test.csv', index=False)
If you want to pull all of the data, you can use a for loop and concatenate every result. Keep in mind that every call of the method is considered one search and 1,000 searches cost $5.
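Before running everything, it can help to estimate the bill. The sketch below is simple arithmetic based on the $5 per 1,000 searches figure above, and it assumes each HTTP request is billed as one search. The function name estimate_cost and its parameters are my own for illustration; check Google's current pricing before relying on the number.

```python
import math

PRICE_PER_THOUSAND = 5.00  # USD per 1,000 searches, the rate quoted in this article


def estimate_cost(query_count, results_per_query=10, results_per_request=10,
                  price_per_thousand=PRICE_PER_THOUSAND):
    """Return (total_requests, estimated_cost_in_usd) for a batch of queries."""
    # Each query needs one request per page of results
    requests_per_query = math.ceil(results_per_query / results_per_request)
    total_requests = query_count * requests_per_query
    return total_requests, total_requests * price_per_thousand / 1000


total, cost = estimate_cost(243)  # 243 is the number of queries in the list above
print(total, "requests, estimated cost in USD:", cost)
```

If you raise the number of results you collect per query, the request count grows with it, so the second parameter is worth keeping an eye on.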
To pull all of the domains, simply run the for loop below.
all_domains = pd.DataFrame()
for search_term in search_array:
    domains = search_link_building_query(search_term)
    frames = [domains, all_domains]
    all_domains = pd.concat(frames, sort=False, ignore_index=True)
all_domains.to_csv('all_link_building_info.csv', index=False)
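Once the results are collected, you will probably want to see which domains show up most often. Here is a rough sketch of one way to do that, assuming the results have a link column holding each result's URL. The function name rank_domains and the sample rows are made up for illustration.

```python
from urllib.parse import urlparse

import pandas as pd


def rank_domains(results, link_column="link"):
    """Count how often each domain appears in a column of result URLs."""
    # Pull the host name out of every URL and lowercase it
    domains = results[link_column].dropna().map(lambda url: urlparse(url).netloc.lower())
    # Treat www.example.com and example.com as the same site
    domains = domains.str.replace(r"^www\.", "", regex=True)
    return domains.value_counts()


# Made-up rows standing in for the collected results.
sample = pd.DataFrame({"link": [
    "https://www.example.com/write-for-us",
    "https://example.com/guest-post",
    "https://blog.example.org/contribute",
]})
ranked = rank_domains(sample)
print(ranked)
```

Domains that appear across many different queries are usually the ones worth putting at the top of your outreach list.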
You can use this script to automate your link building research and easily identify the blogs to reach out to. You could go even a step further and find the email automatically using this email sourcing script. You could then reach out with a summary of the article in your own words and sell to them.
Copyright 2021 Salestream LLC