There has been a lot of talk in the UK recently about whether the time of year you were born affects your schooling. Many teachers have noticed that pupils born at the end of the summer often struggle compared with those born in the autumn, which makes sense because the latter are almost a year older than the former when they start school. However, teachers are not the only ones who think that when you were born affects your future. Astrologers base much of what they do on when a person was born.

To see whether when you were born does affect your future, I have decided to look at various groups over a series of articles. For this, the first article, I am beginning with Nobel laureates, as they come from many different countries, represent excellence in their fields and are reasonably well documented on Wikipedia. The data for the study was collected by using ScraperWiki to build a series of scrapers and views that go through the List of Nobel laureates and find the date of birth for each person.

The Findings

The dates of birth were collated and, from these, frequency charts were constructed for the months of birth and star signs of the Nobel laureates. The findings are illustrated below.

Distribution of Months of Birth Among Nobel Prize Winners

There is not much difference between the months, although June does stand out as having noticeably more Nobel laureates than any other month: at 10.44% it is 3.75 percentage points ahead of January (6.69%), the lowest month apart from February.

February is interesting because it would be expected to be a bit lower, since it has fewer days. If you compare February's lowest number of days with the highest number of days in other months, you will see that it can have as little as 90% of the days (28/31 = 0.9032). This goes some way towards accounting for its low percentage: if you take March, with 31 days, and scale its figure by that ratio you get 7.74% (8.57 * 0.9032). February's actual figure of 5.89% is still below that, so the shorter month is only part of the explanation.
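The day-count adjustment can be checked with a few lines of Ruby. This is only a sketch of the arithmetic in the paragraph above: the helper name scale_to_shorter_month is made up for illustration, and the two percentages are copied from the month-of-birth table.

```ruby
# Figures taken from the month-of-birth table (percent of 747 laureates).
MARCH_PERCENT    = 8.57
FEBRUARY_PERCENT = 5.89

# Hypothetical helper: scale a 31-day month's share down to what a shorter
# month would get if births were spread evenly over the days.
def scale_to_shorter_month(percent, days, full_days = 31)
  percent * days.to_f / full_days
end

ratio    = 28.0 / 31
expected = scale_to_shorter_month(MARCH_PERCENT, 28)

puts "Day ratio (28/31):         #{ratio.round(4)}"
puts "March scaled to 28 days:   #{expected.round(2)}%"
puts "February's actual figure:  #{FEBRUARY_PERCENT}%"
```

Printing the scaled figure next to February's actual figure makes it easy to see how much of the gap the shorter month covers.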

Month of Birth  Frequency  Percent
January         50         6.69
February        44         5.89
March           64         8.57
April           57         7.63
May             63         8.43
June            78         10.44
July            64         8.57
August          68         9.10
September       66         8.84
October         69         9.24
November        60         8.03
December        64         8.57

Sample Size: 747
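For readers who want to build a table like the one above from their own list of dates, the tally might be sketched as follows. The method name month_distribution and the sample dates are assumptions for illustration only; they are not the study's code or data.

```ruby
require 'date'

# Count births per month and express each count as a percentage of the total.
# Returns a hash of month name => [frequency, percent].
def month_distribution(dates)
  counts = Hash.new(0)
  dates.each { |date| counts[date.month] += 1 }
  (1..12).each_with_object({}) do |month, table|
    percent = dates.empty? ? 0.0 : (100.0 * counts[month] / dates.size).round(2)
    table[Date::MONTHNAMES[month]] = [counts[month], percent]
  end
end

# Illustrative sample dates only (not the study's data).
sample = ['14 March 1879', '7 November 1867', '28 June 1906', '11 June 1937']
dates  = sample.map { |text| Date.parse(text) }

month_distribution(dates).each do |month, (frequency, percent)|
  puts format('%-10s %3d %6.2f', month, frequency, percent)
end
```

Months with no births are still listed with a frequency of zero, so the output always has twelve rows.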

Distribution of Star Signs Among Nobel Prize Winners

The difference in the distribution of star signs among Nobel laureates seems to be much greater than the difference in the distribution of the months in which they were born. It is quite clear here that Gemini and Libra stand out from the others, particularly when compared with Capricorn and Aquarius; the greatest difference, between Libra and Capricorn, is 5.09 percentage points.

Star Sign    Frequency  Percent  Dates
Aries        64         8.57     21 March - 19 April
Taurus       61         8.17     20 April - 20 May
Gemini       78         10.44    21 May - 20 June
Cancer       72         9.64     21 June - 22 July
Leo          54         7.23     23 July - 22 August
Virgo        72         9.64     23 August - 22 September
Libra        80         10.71    23 September - 22 October
Scorpio      56         7.50     23 October - 21 November
Sagittarius  60         8.03     22 November - 21 December
Capricorn    42         5.62     22 December - 19 January
Aquarius     47         6.29     20 January - 18 February
Pisces       56         7.50     19 February - 20 March

Sample Size: 747
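The gap between the highest and lowest star signs can be recomputed from the frequencies in the table. The sketch below assumes, as the table's percentages do, that each frequency is divided by the stated sample size of 747; the names SIGN_FREQUENCIES and sign_percent are made up for this example.

```ruby
# Frequencies copied from the star sign table; percentages use the
# stated sample size of 747.
SAMPLE_SIZE = 747
SIGN_FREQUENCIES = {
  'Aries' => 64, 'Taurus' => 61, 'Gemini' => 78, 'Cancer' => 72,
  'Leo' => 54, 'Virgo' => 72, 'Libra' => 80, 'Scorpio' => 56,
  'Sagittarius' => 60, 'Capricorn' => 42, 'Aquarius' => 47, 'Pisces' => 56
}

# Hypothetical helper: a sign's share of the sample, as a rounded percentage.
def sign_percent(sign)
  (100.0 * SIGN_FREQUENCIES.fetch(sign) / SAMPLE_SIZE).round(2)
end

highest = SIGN_FREQUENCIES.max_by { |_, count| count }.first
lowest  = SIGN_FREQUENCIES.min_by { |_, count| count }.first
spread  = (sign_percent(highest) - sign_percent(lowest)).round(2)

puts "#{highest}: #{sign_percent(highest)}%"
puts "#{lowest}: #{sign_percent(lowest)}%"
puts "Difference: #{spread} percentage points"
```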

Problems With The Study

  • The distribution of months of birth and star signs could just reflect the usual distribution of births for that population and therefore should be compared with that of non-prize-winners.
  • The list is dominated by Europeans. Many laureates will therefore have shared similar conditions, such as weather patterns, although school terms will be different where relevant.
  • A few of the laureates didn’t have an accurate date of birth, and were therefore excluded.
  • The sample size is relatively small.
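To get a rough feel for what a sample of this size can show, the sketch below estimates the sampling noise on a single month's or sign's percentage. It is an illustration rather than part of the original study: it assumes births would otherwise be spread evenly over twelve periods and uses the usual standard error of a proportion. The helper name standard_error_points is made up.

```ruby
# Standard error, in percentage points, of a share observed in a sample of
# the given size when the true share is expected_share (between 0 and 1).
def standard_error_points(expected_share, sample_size)
  100.0 * Math.sqrt(expected_share * (1 - expected_share) / sample_size)
end

sample_size    = 747       # from the tables above
expected_share = 1.0 / 12  # assumption: births spread evenly over 12 periods
june_percent   = 10.44     # from the month-of-birth table

se     = standard_error_points(expected_share, sample_size)
excess = (june_percent - 100 * expected_share) / se

puts "Expected share: #{(100 * expected_share).round(2)}%"
puts "Standard error: #{se.round(2)} percentage points"
puts "June's excess:  #{excess.round(1)} standard errors"
```

This only gauges sampling noise against an even spread; it does not replace the comparison with non-prize-winners suggested above.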

Conclusion

There does seem to be some variation between the birth periods and, interestingly, this seems to be more pronounced for star signs than for months of birth. In particular, Geminis and Libras, or people born in June, do stand out as being more likely to receive a Nobel Prize, whereas Capricorns and Aquarians, or people born in January or February, are less likely to receive one.

Commissions

This study highlights the power of scraping the web to extract this sort of statistic and, given the time, it could be extended to increase confidence in the data and draw more accurate conclusions. If you would like to commission vLife Systems to create a scraper which will extract data from websites or other data sources of interest to you, please get in touch via email: info@vlifesystems.com.

Scrapers and Views

For those interested, links to the views and the code for the scrapers are listed below. The code is current at the time of writing, but may have changed since, so please go to the original source to see the latest versions.

Nobel Prize Winners Names and Wiki Urls

This was the first stage. The scraper was used to compile a database of Nobel laureates and links to their pages on Wikipedia. The original scraper is to be found on ScraperWiki: Nobel Prize Winners Names and Wiki Urls

require 'nokogiri'

html = ScraperWiki.scrape("http://en.wikipedia.org/wiki/Nobel_prize_winners")

# Map each laureate's absolute Wikipedia URL to their name.
# Keying on the URL means a laureate listed more than once is stored once.
winners = {}
doc = Nokogiri::HTML(html)
doc.css('table.wikitable td span.fn a').each do |a|
  name = a.inner_text
  wiki_url = a['href']
  absolute_url = "http://wikipedia.org#{wiki_url}"
  winners[absolute_url] = name
end

# Save data to database
winners.each do |url, name|
  data = {
    'url' => url,
    'name' => name
  }
  # Arguments are positional: the unique keys first, then the row
  ScraperWiki.save_sqlite(['url'], data)
end

Nobel Prize Winners’ DOB

The next stage was to scrape the Wikipedia page of each person and get their Date of Birth. The original scraper is to be found on ScraperWiki: Nobel Prize Winners’ DOB

require 'date'
require 'nokogiri'

module StarSign
  # Dates from: http://my.horoscope.com/astrology/horoscope-sign-index.html
  STAR_SIGN_DATES = {
    'aries' =>       ['21 March 2011', '19 April 2011'],
    'taurus' =>      ['20 April 2011', '20 May 2011'],
    'gemini' =>      ['21 May 2011', '20 June 2011'],
    'cancer' =>      ['21 June 2011', '22 July 2011'],
    'leo' =>         ['23 July 2011', '22 August 2011'],
    'virgo' =>       ['23 August 2011', '22 September 2011'],
    'libra' =>       ['23 September 2011', '22 October 2011'],
    'scorpio' =>     ['23 October 2011', '21 November 2011'],
    'sagittarius' => ['22 November 2011', '21 December 2011'],
    'capricorn' =>   ['22 December 2011', '19 January 2012'],
    'aquarius' =>    ['20 January 2011', '18 February 2011'],
    'pisces' =>      ['19 February 2011', '20 March 2011']
  }

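  def star_sign
    # Move the date into the reference year used by STAR_SIGN_DATES.
    # 29 February does not exist in 2011, so treat it as 28 February
    # (both fall in pisces).
    ref_day = (month == 2 && day == 29) ? 28 : day
    compare_date = Date.new(2011, month, ref_day)
    STAR_SIGN_DATES.each do |sign, dates|
      if compare_date >= Date.parse(dates[0]) &&
         compare_date <= Date.parse(dates[1])
        return sign
      end
    end
    # 1-19 January matches no range above, because the capricorn range
    # runs from December 2011 into January 2012, so it falls through to here.
    return 'capricorn'
  end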
end

class Date
  include StarSign
end

class DOBScraper
  attr_reader :population

  def initialize(dob_database)
    ScraperWiki.attach(dob_database) 
    @population = ScraperWiki.select(
      "name, url from nobel_prize_winners_names_and_wiki_urls.swdata 
       order by name"
    )
    @last_saved_name = ScraperWiki.get_var('last_saved_name') 
  end

  def dump_dob(name, dob, star_sign)
    data = {
      'name' => name,
      'dob' => dob,
      'star_sign' => star_sign
    }

    ScraperWiki.save_sqlite(['name'], data)
    ScraperWiki.save_var('last_saved_name', name)
    @last_saved_name = name
  end

  def extract_dob(person)
    name, url = person['name'], person['url']
    begin
      html = ScraperWiki.scrape(url)
    rescue StandardError => error
      puts "Error: #{error} (url: #{url})"
      return  # nothing to parse if the page could not be fetched
    end

    doc = Nokogiri::HTML(html)
    # The infobox row headed "Born" holds the date and place of birth
    doc.css('table.infobox th').each do |th|
      if th.inner_text == "Born"
        born = th.parent.at('td').inner_text
        # Keep everything up to the first four-digit year from 1600 to 1999
        dob = born.scan(/.*?1[6789]\d\d/).first
        begin
          star_sign = Date.parse(dob).star_sign
          dump_dob(name, dob, star_sign)
        rescue StandardError => error
          puts "Error: #{error} dob: #{dob} (name: #{name} url: #{url})"
        end
      end
    end
  end

  def skip_person?(name)
    return false unless @last_saved_name
    name_index = @population.find_index{|winner| winner['name'] == name}
    last_saved_index = @population.find_index{|winner| winner['name'] == @last_saved_name}
    last_saved_index >= name_index && last_saved_index != @population.size-1
  end

  def scrape
    @population.each do |person|
      unless skip_person?(person['name'])
        extract_dob(person)
      end
    end
  end
end

dob_scraper = DOBScraper.new('nobel_prize_winners_names_and_wiki_urls')
dob_scraper.scrape
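The star sign lookup above relies on a fallback for early January because its reference dates carry a year. One possible alternative, sketched here rather than taken from the original scraper, compares only the month and day. SIGN_STARTS and sign_for are hypothetical names; the boundaries are the same as those in STAR_SIGN_DATES.

```ruby
require 'date'

# Start of each sign as [month, day], in calendar order, using the same
# boundaries as STAR_SIGN_DATES in the scraper.
SIGN_STARTS = [
  [[1, 20],  'aquarius'],
  [[2, 19],  'pisces'],
  [[3, 21],  'aries'],
  [[4, 20],  'taurus'],
  [[5, 21],  'gemini'],
  [[6, 21],  'cancer'],
  [[7, 23],  'leo'],
  [[8, 23],  'virgo'],
  [[9, 23],  'libra'],
  [[10, 23], 'scorpio'],
  [[11, 22], 'sagittarius'],
  [[12, 22], 'capricorn']
]

# Hypothetical helper: returns the sign for a Date without parsing any
# reference dates, so 29 February and early January need no special case.
def sign_for(date)
  key = [date.month, date.day]
  # Pick the last sign whose start is on or before the date. Dates before
  # 20 January match none and belong to capricorn.
  match = SIGN_STARTS.reverse.find { |start, _| (start <=> key) <= 0 }
  match ? match.last : 'capricorn'
end

puts sign_for(Date.new(1879, 3, 14))  # pisces
```

Arrays compare element by element in Ruby, so [month, day] pairs sort in calendar order without any date arithmetic.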

Nobel Prize Winners’ Star Sign and Month of Birth Views

To visualise the results of the scraping I created a couple of views. I have decided not to include the code for these here as they are quite long and would be better off linked to. They can again be found on ScraperWiki: Nobel Prize Winners MOB and Nobel Prize Winners’ Star Signs