Does When You Were Born Affect Your Chance of Becoming a Nobel Laureate? Scraping Wikipedia to Find Out

There has been a lot of talk in the UK recently about whether when you were born affects your schooling. Many teachers have noticed that pupils born at the end of the summer often struggle compared with those born in the autumn, which makes sense because the autumn-born pupils are almost a year older when they start school. However, teachers are not the only ones who think that when you were born affects your future: astrologers base much of what they do on a person's date of birth.

To see if when you were born does affect your future, I have decided to look at various groups over a series of articles. For this, the first article, I am beginning with Nobel laureates, as they come from many different countries, represent excellence in their fields and are reasonably well documented on Wikipedia. The data for the study was collected by using ScraperWiki to build a series of scrapers and views that go through the List of Nobel laureates and find the date of birth for each person.

The Findings

The dates of birth were collated, and from them frequency charts were constructed for the months of birth and star signs of the Nobel laureates. The findings are set out below.

Distribution of Months of Birth Among Nobel Prize Winners

There is not too much difference here between the months, although June does stand out as having noticeably more Nobel laureates than any other month: it is 3.75 percentage points ahead of January, the lowest month apart from February.

February is interesting because it would be expected to be a bit lower, since it has fewer days. Comparing February at its shortest with the 31 days of the longest months, it can have as few as 90% of the days (28/31 = 0.9032). This accounts for part of its low percentage: taking March, with 31 days, and scaling its figure by the same ratio gives 7.74% (8.57 * 0.9032). February's 5.89% is a little low compared with that, but not markedly so.

Month of Birth  Frequency  Percent
January         50         6.69
February        44         5.89
March           64         8.57
April           57         7.63
May             63         8.43
June            78         10.44
July            64         8.57
August          68         9.10
September       66         8.84
October         69         9.24
November        60         8.03
December        64         8.57

Sample Size: 747
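
The allowance made for February above can be applied to every month at once by dividing each month's frequency by its number of days. The following is a minimal Ruby sketch using the figures from the table; the constant and method names are illustrative, and February is assumed to average 28.25 days to allow for leap years.

```ruby
# Frequencies taken from the month-of-birth table (sample size 747).
MONTH_FREQUENCIES = {
  'January' => 50, 'February' => 44, 'March' => 64, 'April' => 57,
  'May' => 63, 'June' => 78, 'July' => 64, 'August' => 68,
  'September' => 66, 'October' => 69, 'November' => 60, 'December' => 64
}

# Assumption: February averages 28.25 days once leap years are counted.
DAYS_IN_MONTH = {
  'January' => 31, 'February' => 28.25, 'March' => 31, 'April' => 30,
  'May' => 31, 'June' => 30, 'July' => 31, 'August' => 31,
  'September' => 30, 'October' => 31, 'November' => 30, 'December' => 31
}

# Returns the number of laureates born per calendar day of each month.
def laureates_per_day(frequencies, days)
  frequencies.each_with_object({}) do |(month, count), rates|
    rates[month] = count / days[month].to_f
  end
end

laureates_per_day(MONTH_FREQUENCIES, DAYS_IN_MONTH).each do |month, rate|
  puts format('%-10s %.2f per day', month, rate)
end
```

Dividing by the number of days puts February on the same footing as the longer months, so any difference that remains is not down to month length.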

Distribution of Star Signs Among Nobel Prize Winners

The variation in the distribution of star signs among Nobel laureates seems to be much greater than the variation between the months in which they were born. It is quite clear here that Gemini and Libra stand out from the others, particularly when compared to Capricorn and Aquarius: the greatest difference, between Libra and Capricorn, is 5.09 percentage points.

Star Sign    Frequency  Percent  Dates
Aries        64         8.57     21 March - 19 April
Taurus       61         8.17     20 April - 20 May
Gemini       78         10.44    21 May - 20 June
Cancer       72         9.64     21 June - 22 July
Leo          54         7.23     23 July - 22 August
Virgo        72         9.64     23 August - 22 September
Libra        80         10.71    23 September - 22 October
Scorpio      56         7.50     23 October - 21 November
Sagittarius  60         8.03     22 November - 21 December
Capricorn    42         5.62     22 December - 19 January
Aquarius     47         6.29     20 January - 18 February
Pisces       56         7.50     19 February - 20 March

Sample Size: 747
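
Star signs, like months, do not all cover the same number of days, so some allowance for length applies here too. The sketch below counts the days in each sign from the date ranges in the table. It assumes a non-leap reference year (2011), and the constant and method names are illustrative.

```ruby
require 'date'

# Date ranges from the star sign table, as [month, day] pairs.
SIGN_RANGES = {
  'Aries'       => [[3, 21],  [4, 19]],
  'Taurus'      => [[4, 20],  [5, 20]],
  'Gemini'      => [[5, 21],  [6, 20]],
  'Cancer'      => [[6, 21],  [7, 22]],
  'Leo'         => [[7, 23],  [8, 22]],
  'Virgo'       => [[8, 23],  [9, 22]],
  'Libra'       => [[9, 23],  [10, 22]],
  'Scorpio'     => [[10, 23], [11, 21]],
  'Sagittarius' => [[11, 22], [12, 21]],
  'Capricorn'   => [[12, 22], [1, 19]],
  'Aquarius'    => [[1, 20],  [2, 18]],
  'Pisces'      => [[2, 19],  [3, 20]]
}

# Number of days a sign covers, counting both the first and last day.
# Assumption: a non-leap reference year, so 29 February is not counted.
def days_in_sign(first, last, year = 2011)
  start_date = Date.new(year, *first)
  end_date = Date.new(year, *last)
  # A range such as Capricorn's runs past 31 December into the next year.
  end_date = end_date.next_year if end_date < start_date
  (end_date - start_date).to_i + 1
end

SIGN_LENGTHS = SIGN_RANGES.each_with_object({}) do |(sign, (first, last)), lengths|
  lengths[sign] = days_in_sign(first, last)
end

SIGN_LENGTHS.each { |sign, days| puts format('%-12s %d days', sign, days) }
```

Under these ranges Capricorn covers 29 days and Cancer 32, so Capricorn would be expected to have slightly fewer laureates on length alone.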

Problems With The Study

  • The distribution of months of birth and star signs could simply reflect the usual distribution of births in the population as a whole, and should therefore be compared with that of non-prize-winners.
  • The list is dominated by Europeans, so many laureates will have shared similar conditions, such as weather patterns, although school terms, where relevant, will have differed.
  • A few of the laureates did not have an accurate date of birth and were therefore excluded.
  • The sample size is relatively small.
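
One way to gauge whether a sample of this size shows more than chance variation is a chi-squared goodness-of-fit test. The sketch below is not part of the original study. It assumes, purely for illustration, that births in the wider population are spread evenly over the days of the year, which the first point above notes may not be true. The value 19.675 is the standard 5% critical value for 11 degrees of freedom, and the names used are illustrative.

```ruby
# Month-of-birth frequencies, January to December (sample size 747).
OBSERVED = [50, 44, 64, 57, 63, 78, 64, 68, 66, 69, 60, 64]

# Assumption: expected births are proportional to days in each month,
# with February averaging 28.25 days.
DAYS = [31, 28.25, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

# 5% critical value of the chi-squared distribution, 11 degrees of freedom.
CRITICAL_VALUE = 19.675

# Returns the chi-squared statistic for observed counts against expected
# counts that are proportional to the given weights.
def chi_squared(observed, weights)
  total = observed.sum
  weight_total = weights.sum.to_f
  observed.zip(weights).sum do |count, weight|
    expected = total * weight / weight_total
    (count - expected)**2 / expected
  end
end

statistic = chi_squared(OBSERVED, DAYS)
puts format('chi-squared = %.2f', statistic)
if statistic > CRITICAL_VALUE
  puts 'Exceeds the 5% critical value'
else
  puts 'Does not exceed the 5% critical value'
end
```

A statistic below the critical value would mean that the month-to-month differences are within what chance alone could produce under that assumption. The same method could be run against real birth figures for the relevant populations if they were available.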

Conclusion

There does seem to be some variation between the birth periods and, interestingly, this seems to be more pronounced for star signs than for months of birth. In particular, Geminis and Libras, or people born in June, do stand out as being more likely to receive a Nobel Prize, whereas Capricorns and Aquarians, or people born in January or February, are less likely to receive one.

Other Articles in the Series

  • Pisceans and October Babies More Likely to Become Poets. Scraping Wikipedia Reveals All

Scrapers and Views

For those interested, links to the views and the code for the scrapers are listed below. The code is current at the time of writing, but may have changed since, so please go to the original source to see the latest versions.

Nobel Prize Winners Names and Wiki Urls

This was the first stage. The scraper was used to compile a database of Nobel laureates and links to their pages on Wikipedia. The original scraper is to be found on ScraperWiki: Nobel Prize Winners Names and Wiki Urls

require 'nokogiri'

# Fetch the list of laureates from Wikipedia
html = ScraperWiki.scrape("http://en.wikipedia.org/wiki/Nobel_prize_winners")

# Map each laureate's Wikipedia URL to their name.  Keying on the URL
# means that a laureate who is listed more than once is only stored once.
winners = {}
doc = Nokogiri::HTML(html)
doc.css('table.wikitable td span.fn a').each do |a|
  name = a.inner_text
  wiki_url = a['href']
  absolute_url = "http://wikipedia.org#{wiki_url}"
  winners[absolute_url] = name
end

# Save data to database, one row per unique URL
winners.each do |url, name|
  data = {
    'url' => url,
    'name' => name
  }
  ScraperWiki.save_sqlite(['url'], data)
end

Nobel Prize Winners' DOB

The next stage was to scrape the Wikipedia page of each person and get their Date of Birth. The original scraper is to be found on ScraperWiki: Nobel Prize Winners' DOB

require 'date'
require 'nokogiri'

module StarSign
  # Dates from: http://my.horoscope.com/astrology/horoscope-sign-index.html
  # The year is only a placeholder.  2012 is used because it is a leap
  # year, so that a birthday of 29 February can be compared as well.
  STAR_SIGN_DATES = {
    'aries' =>       ['21 March 2012', '19 April 2012'],
    'taurus' =>      ['20 April 2012', '20 May 2012'],
    'gemini' =>      ['21 May 2012', '20 June 2012'],
    'cancer' =>      ['21 June 2012', '22 July 2012'],
    'leo' =>         ['23 July 2012', '22 August 2012'],
    'virgo' =>       ['23 August 2012', '22 September 2012'],
    'libra' =>       ['23 September 2012', '22 October 2012'],
    'scorpio' =>     ['23 October 2012', '21 November 2012'],
    'sagittarius' => ['22 November 2012', '21 December 2012'],
    'capricorn' =>   ['22 December 2012', '19 January 2013'],
    'aquarius' =>    ['20 January 2012', '18 February 2012'],
    'pisces' =>      ['19 February 2012', '20 March 2012']
  }

  # Returns the star sign for this date, ignoring its year
  def star_sign
    compare_date = Date.new(2012, month, day)
    STAR_SIGN_DATES.each do |sign, dates|
      if compare_date >= Date.parse(dates[0]) &&
         compare_date <= Date.parse(dates[1])
        return sign
      end
    end
    # Dates from 1 to 19 January fall through to here, because the
    # capricorn range above runs into the following year
    return 'capricorn'
  end
end

class Date
  include StarSign
end

class DOBScraper
  attr_reader :population

  def initialize(dob_database)
    ScraperWiki.attach(dob_database)
    @population = ScraperWiki.select(
      "name, url from nobel_prize_winners_names_and_wiki_urls.swdata
       order by name"
    )
    @last_saved_name = ScraperWiki.get_var('last_saved_name')
  end

  def dump_dob(name, dob, star_sign)
    data = {
      'name' => name,
      'dob' => dob,
      'star_sign' => star_sign
    }

    ScraperWiki.save_sqlite(['name'], data)
    ScraperWiki.save_var('last_saved_name', name)
    @last_saved_name = name
  end

  def extract_dob(person)
    name,url = person['name'], person['url']
    begin
      html = ScraperWiki.scrape(url)
    rescue StandardError => error
      puts "Error: #{error} (url: #{url})"
      return  # Nothing to parse if the page could not be fetched
    end

    doc = Nokogiri::HTML(html)
    doc.css('table.infobox th').each do |th|
      if th.inner_text == "Born"
        born = th.parent.at('td').inner_text
        dob = born.scan(/.*?1[6789]\d\d/).first
        begin
          star_sign = Date.parse(dob).star_sign
          dump_dob(name, dob, star_sign)
        rescue StandardError => error
          puts "Error: #{error} dob: #{dob} (name: #{name} url: #{url})"
        end

      end
    end

  end

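  def skip_person?(name)
    return false unless @last_saved_name
    name_index = @population.find_index{|winner| winner['name'] == name}
    last_saved_index = @population.find_index{|winner| winner['name'] == @last_saved_name}
    # Don't skip anyone if the saved name is no longer in the list
    return false unless last_saved_index
    # Skip everyone up to and including the last saved name, unless the
    # previous run reached the end of the list, in which case start again
    last_saved_index >= name_index && last_saved_index != @population.size-1
  end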

  def scrape
    @population.each do |person|
      unless skip_person?(person['name'])
        extract_dob(person)
      end
    end
  end
end

dob_scraper = DOBScraper.new('nobel_prize_winners_names_and_wiki_urls')
dob_scraper.scrape
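
The date extraction used in extract_dob can be tried on its own. The sketch below applies the same regular expression to a made-up piece of infobox text and parses the match with Date.parse. The sample string, the DOB_PATTERN constant and the parse_dob method are assumptions for illustration and are not part of the scraper.

```ruby
require 'date'

# Same pattern as in extract_dob: the shortest run of text on a line
# that ends in a year between 1600 and 1999.
DOB_PATTERN = /.*?1[6789]\d\d/

# Returns the date of birth found in the text, or nil if no year is
# found or the matched text cannot be parsed as a date.
def parse_dob(born_text)
  match = born_text.scan(DOB_PATTERN).first
  return nil unless match
  Date.parse(match)
rescue ArgumentError
  nil
end

# Hypothetical infobox text, for illustration only.
sample = '14 March 1879 Ulm, German Empire'
dob = parse_dob(sample)
puts dob                  # the parsed date in ISO 8601 form
puts dob.strftime('%B')   # the month name, as used for the month counts
```

Because the pattern only accepts years from 1600 to 1999, any text without such a year yields nil here, which corresponds to the laureates the scraper reports as errors and leaves out.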

Nobel Prize Winners' Star Sign and Month of Birth Views

To visualise the results of the scraping I created a couple of views. I have decided not to include the code for these here, as they are quite long and are better linked to. They can again be found on ScraperWiki: Nobel Prize Winners MOB and Nobel Prize Winners' Star Signs
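
Although the views are long, the counting behind a frequency table is short. The following sketch is not the code of the views: it assumes records shaped like those saved by dump_dob (a hash with a 'dob' string), uses three made-up records, and the method names are illustrative.

```ruby
require 'date'

# Counts records by month of birth and returns { month name => count }.
def month_frequencies(records)
  counts = Hash.new(0)
  records.each do |record|
    counts[Date.parse(record['dob']).strftime('%B')] += 1
  end
  counts
end

# Converts counts to percentages of the total, rounded to two places.
def percentages(counts)
  total = counts.values.sum
  counts.each_with_object({}) do |(key, count), result|
    result[key] = (100.0 * count / total).round(2)
  end
end

# Hypothetical records, for illustration only.
SAMPLE_RECORDS = [
  { 'name' => 'Person A', 'dob' => '14 March 1879' },
  { 'name' => 'Person B', 'dob' => '7 November 1867' },
  { 'name' => 'Person C', 'dob' => '2 March 1901' }
]

frequencies = month_frequencies(SAMPLE_RECORDS)
percentages(frequencies).each do |month, percent|
  puts format('%-10s %d  %.2f%%', month, frequencies[month], percent)
end
```

The same two steps, applied to the star_sign field instead of the month, would give the star sign table.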

Creative Commons License
Does When You Were Born Affect Your Chance of Becoming a Nobel Laureate? Scraping Wikipedia to Find Out by Lawrence Woodman is licensed under a Creative Commons Attribution 4.0 International License.
