In [1]:
import pymongo
import datetime
import collections
from numpy import nan as NA
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
#import seaborn as sns
import folium
import numpy as np
import pandas as pd
import scipy.stats
import psycopg2 as pg
import pandas as pd
import pandas.io.sql as psqlg
import matplotlib.patches as mpatches
from itertools import cycle, islice

Questions

If I classify the species ‘bat’ into a specific species based on location data? I could have this as one question and the other one could be a comparison of numbers of hibernating and roosting bats to ascertain if they follow similar patterns to determine if some types of bats survive hibernation better than others?

Question 1 Classification was carried out using the leave-one-out algorithm in 'LeaveOneOut' notebook and the K Nearest Neighbours classifier in the KNN notebook. The value of k which produced the best results was 1 using the leave-one-out algorithm, but this is not thought to be reliable and as such the value of 3 was selected based on manual calculation of optimum euclidean distance in the KNN notebook.

08/08/2016 Question 1 have the bat numbers over time increased by a healthy percentage. Question 2 have the bat numbers of the rare bat increased or decreased. Have the changes made by combining the datasets where commonname = 'bat' and re-classifying them made a difference to the results.

At present we are able to produce Roost Count trends for common pipistrelle, soprano pipistrelle, serotine, Natterer's bat, brown long-eared bat, lesser horseshoe bat and greater horseshoe bat

In [2]:
PrioritySpecies = ['Lesser Horseshoe Bat','Greater Horseshoe Bat', 'Soprano Pipistrelle','Brown Long-eared Bat']
#These bats are the priority species reported in visualisation from the report:
#The state of the UK’s bats 2014, National Bat Monitoring Programme Population Trends,(The Bat Conservation Trust, 2014)
In [3]:
OtherSpecies = ['Common Pipistrelle', 'Serotine', "'Natterer's Bat'", "'Daubenton's bat'",'Noctule',"'Brandt's bat'", 
                "'Whiskered/Brandt's bat'"]
#These bats are the additional species reported in visualisation from the report:
#The state of the UK’s bats 2014, National Bat Monitoring Programme Population Trends,(The Bat Conservation Trust, 2014)

Greater horseshoe bat* (Roost Count) significant increase

Lesser horseshoe bat* (Hibernation Survey & Roost Count) We conclude that the lesser horseshoe bat population is increasing

Daubenton’s bat (Hibernation Survey) Daubenton's bat is showing a small but significant increase

Common pipistrelle (Field Survey) We conclude that the common pipistrelle population is increasing from the upward trend seen in this species in the Field Survey

whiskered/Brandt’s bat, soprano pipistrelle, noctule, serotine and brown long-eared bat*; none of these showed significant changes over the monitoring period to 2013.

At present, there are insufficient data available for the other six UK breeding bat species (Bechstein’s bat, Alcathoe bat, Leisler’s bat, Nathusius’ pipistrelle, barbastelle and grey long-eared bat) to allow estimation of population trends

Grey long-eared bat surveillance We are always striving to improve our knowledge of the distribution of bat species, particularly where information is lacking or patchy. The grey longeared bat is one of the UK’s rarest bat species - 1,000 individuals and a distribution that is restricted mainly to the southern coast of England and Wales.

Reading in the cleaned and classified data. For details of the cleaning process please see R0050097_project_diary.

In [4]:
#Connecting to the database
%load_ext sql
%sql postgresql://test:test@localhost:5432/tm351test
Out[4]:
'Connected: test@tm351test'

Connecting to the database

In [5]:
conn = pg.connect(dbname='tm351test', host='localhost', user='test', password='test', port=5432)

Checking the tables in the tm351test schema.

In [6]:
%%sql 
SELECT * FROM information_schema.tables
158 rows affected.
Out[6]:
table_catalog table_schema table_name table_type self_referencing_column_name reference_generation user_defined_type_catalog user_defined_type_schema user_defined_type_name is_insertable_into is_typed commit_action
tm351test pg_catalog pg_type BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_roles VIEW None None None None None NO NO None
tm351test pg_catalog pg_group VIEW None None None None None NO NO None
tm351test pg_catalog pg_user VIEW None None None None None NO NO None
tm351test pg_catalog pg_rules VIEW None None None None None NO NO None
tm351test pg_catalog pg_views VIEW None None None None None NO NO None
tm351test pg_catalog pg_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_matviews VIEW None None None None None NO NO None
tm351test pg_catalog pg_indexes VIEW None None None None None NO NO None
tm351test pg_catalog pg_locks VIEW None None None None None NO NO None
tm351test pg_catalog pg_settings VIEW None None None None None NO NO None
tm351test pg_catalog pg_cursors VIEW None None None None None NO NO None
tm351test pg_catalog pg_available_extensions VIEW None None None None None NO NO None
tm351test pg_catalog pg_available_extension_versions VIEW None None None None None NO NO None
tm351test pg_catalog pg_prepared_xacts VIEW None None None None None NO NO None
tm351test pg_catalog pg_prepared_statements VIEW None None None None None NO NO None
tm351test pg_catalog pg_seclabels VIEW None None None None None NO NO None
tm351test pg_catalog pg_timezone_abbrevs VIEW None None None None None NO NO None
tm351test pg_catalog pg_timezone_names VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_all_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_xact_all_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_sys_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_xact_sys_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_user_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_xact_user_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_statio_all_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_statio_sys_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_statio_user_tables VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_all_indexes VIEW None None None None None NO NO None
tm351test pg_catalog pg_proc BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_statio_sys_indexes VIEW None None None None None NO NO None
tm351test pg_catalog pg_statio_user_indexes VIEW None None None None None NO NO None
tm351test pg_catalog pg_class BASE TABLE None None None None None YES NO None
tm351test public hibernationcoords BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_statio_all_sequences VIEW None None None None None NO NO None
tm351test pg_catalog pg_statio_sys_sequences VIEW None None None None None NO NO None
tm351test pg_catalog pg_statio_user_sequences VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_activity VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_replication VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_database VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_database_conflicts VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_user_functions VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_xact_user_functions VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_bgwriter VIEW None None None None None NO NO None
tm351test pg_catalog pg_user_mappings VIEW None None None None None NO NO None
tm351test pg_catalog pg_attribute BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_constraint BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_inherits BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_index BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_operator BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_opfamily BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_opclass BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_am BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_amop BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_amproc BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_language BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_largeobject_metadata BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_aggregate BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_database BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_stats VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_sys_indexes VIEW None None None None None NO NO None
tm351test pg_catalog pg_stat_user_indexes VIEW None None None None None NO NO None
tm351test pg_catalog pg_statio_all_indexes VIEW None None None None None NO NO None
tm351test information_schema information_schema_catalog_name VIEW None None None None None NO NO None
tm351test information_schema applicable_roles VIEW None None None None None NO NO None
tm351test information_schema administrable_role_authorizations VIEW None None None None None NO NO None
tm351test information_schema attributes VIEW None None None None None NO NO None
tm351test pg_catalog pg_trigger BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_event_trigger BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_description BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_cast BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_enum BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_conversion BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_depend BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_db_role_setting BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_tablespace BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_pltemplate BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_auth_members BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_shdepend BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_shdescription BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_ts_config BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_ts_config_map BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_ts_dict BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_ts_parser BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_ts_template BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_extension BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_foreign_data_wrapper BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_foreign_server BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_foreign_table BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_default_acl BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_seclabel BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_shseclabel BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_collation BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_range BASE TABLE None None None None None YES NO None
tm351test information_schema character_sets VIEW None None None None None NO NO None
tm351test information_schema check_constraint_routine_usage VIEW None None None None None NO NO None
tm351test information_schema check_constraints VIEW None None None None None NO NO None
tm351test information_schema collations VIEW None None None None None NO NO None
tm351test information_schema collation_character_set_applicability VIEW None None None None None NO NO None
tm351test information_schema column_domain_usage VIEW None None None None None NO NO None
tm351test information_schema column_privileges VIEW None None None None None NO NO None
tm351test information_schema column_udt_usage VIEW None None None None None NO NO None
tm351test information_schema columns VIEW None None None None None NO NO None
tm351test information_schema constraint_column_usage VIEW None None None None None NO NO None
tm351test information_schema constraint_table_usage VIEW None None None None None NO NO None
tm351test information_schema domain_constraints VIEW None None None None None NO NO None
tm351test information_schema domain_udt_usage VIEW None None None None None NO NO None
tm351test information_schema domains VIEW None None None None None NO NO None
tm351test information_schema enabled_roles VIEW None None None None None NO NO None
tm351test information_schema key_column_usage VIEW None None None None None NO NO None
tm351test information_schema parameters VIEW None None None None None NO NO None
tm351test information_schema referential_constraints VIEW None None None None None NO NO None
tm351test information_schema role_column_grants VIEW None None None None None NO NO None
tm351test information_schema routine_privileges VIEW None None None None None NO NO None
tm351test information_schema role_routine_grants VIEW None None None None None NO NO None
tm351test information_schema routines VIEW None None None None None NO NO None
tm351test information_schema schemata VIEW None None None None None NO NO None
tm351test information_schema sequences VIEW None None None None None NO NO None
tm351test information_schema sql_implementation_info BASE TABLE None None None None None YES NO None
tm351test information_schema sql_languages BASE TABLE None None None None None YES NO None
tm351test information_schema sql_packages BASE TABLE None None None None None YES NO None
tm351test information_schema table_constraints VIEW None None None None None NO NO None
tm351test information_schema table_privileges VIEW None None None None None NO NO None
tm351test information_schema role_table_grants VIEW None None None None None NO NO None
tm351test information_schema tables VIEW None None None None None NO NO None
tm351test information_schema triggered_update_columns VIEW None None None None None NO NO None
tm351test information_schema triggers VIEW None None None None None NO NO None
tm351test information_schema udt_privileges VIEW None None None None None NO NO None
tm351test information_schema role_udt_grants VIEW None None None None None NO NO None
tm351test information_schema usage_privileges VIEW None None None None None NO NO None
tm351test information_schema role_usage_grants VIEW None None None None None NO NO None
tm351test information_schema user_defined_types VIEW None None None None None NO NO None
tm351test information_schema view_column_usage VIEW None None None None None NO NO None
tm351test information_schema view_routine_usage VIEW None None None None None NO NO None
tm351test information_schema view_table_usage VIEW None None None None None NO NO None
tm351test information_schema views VIEW None None None None None NO NO None
tm351test information_schema data_type_privileges VIEW None None None None None NO NO None
tm351test information_schema element_types VIEW None None None None None NO NO None
tm351test information_schema column_options VIEW None None None None None NO NO None
tm351test information_schema foreign_data_wrapper_options VIEW None None None None None NO NO None
tm351test information_schema foreign_data_wrappers VIEW None None None None None NO NO None
tm351test information_schema foreign_server_options VIEW None None None None None NO NO None
tm351test information_schema foreign_servers VIEW None None None None None NO NO None
tm351test information_schema foreign_table_options VIEW None None None None None NO NO None
tm351test information_schema foreign_tables VIEW None None None None None NO NO None
tm351test information_schema user_mapping_options VIEW None None None None None NO NO None
tm351test information_schema user_mappings VIEW None None None None None NO NO None
tm351test information_schema sql_sizing_profiles BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_attrdef BASE TABLE None None None None None YES NO None
tm351test public classifiedcombineddata BASE TABLE None None None None None YES NO None
tm351test public hibernationbats BASE TABLE None None None None None YES NO None
tm351test public roostbats BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_namespace BASE TABLE None None None None None YES NO None
tm351test pg_catalog pg_rewrite BASE TABLE None None None None None YES NO None
tm351test public classifiedroostbats BASE TABLE None None None None None YES NO None
tm351test public classifiedhibernationbats BASE TABLE None None None None None YES NO None
tm351test information_schema sql_features BASE TABLE None None None None None YES NO None
tm351test information_schema sql_sizing BASE TABLE None None None None None YES NO None
In [7]:
#Reading in the data from the tables in the postgreSQL database initialised in the notebook PostrgreSQL_DB

I will select a period of one year initially

In [8]:
#These are the classified hibernation bats with the commonname bat replaced by the nearest neighbour algorithm and the erroneous 
#grid refences removed. This is therefore the cleaned data to work with.
Hibernation_data = pd.read_sql_query("SELECT * FROM ClassifiedHibernationBats",conn)
Hibernation_data.head()
Out[8]:
observationid recordkey gridreference year month latitude longitude commonname
0 463169251 1 TQ95 2008 1 51.29905055 0.802680508 Bechstein's Bat
1 463169252 2 TQ95 2008 9 51.29905055 0.802680508 Daubenton's Bat
2 463169254 4 TQ95 2008 9 51.29905055 0.802680508 Bechstein's Bat
3 463167926 5 TQ95 2008 1 51.29905055 0.802680508 Daubenton's Bat
4 463167927 6 TQ95 2008 9 51.29905055 0.802680508 Daubenton's Bat
In [9]:
len(Hibernation_data)
Out[9]:
22133
In [10]:
#checking there are no 'Bat' commonnames
batlist = Hibernation_data[Hibernation_data['commonname']=='Bat']
batlist
Out[10]:
observationid recordkey gridreference year month latitude longitude commonname
In [11]:
#counts for each commonname
cleaned = Hibernation_data.commonname.value_counts()
cleaned
Out[11]:
Natterer's Bat            5061
Lesser Horseshoe Bat      4547
Daubenton's Bat           3950
Greater Horseshoe Bat     2965
Brown Long-eared Bat      2835
Whiskered/Brandt's Bat    1324
Pipistrelle                568
Common Pipistrelle         278
Whiskered Bat              204
Soprano Pipistrelle        137
Brandt's Bat               135
Bechstein's Bat             75
Serotine                    45
Noctule Bat                  8
Grey Long-eared Bat          1
Name: commonname, dtype: int64

Comparing the unclassified data - Hibuncleaned_data with the k-nn classified data - Hibernation_data to ensure integrity has been maintained.

Comparison of cleaned and uncleaned hibernation data

In [12]:
#reading hibernation data from the database
Hibuncleaned_data = pd.read_sql_query("SELECT * FROM HibernationBats", conn)
len(Hibuncleaned_data)
Out[12]:
22133
In [13]:
#looking at the 'Bat' commonnmae
batlist1 = Hibuncleaned_data[Hibuncleaned_data['commonname']=='Bat']
batlist1.head()
Out[13]:
observationid recordkey gridreference commonname year month latitude longitude
2 463169253 3 TQ95 Bat 2008 9 51.29905055 0.802680508
7 463172437 9 TR15 Bat 2008 1 51.29938038 1.089541645
14 463164993 16 SJ59 Bat 2008 1 53.45436471 -2.671620082
62 463171705 64 SX87 Bat 2008 2 50.57902473 -3.618490044
145 463174131 149 TF01 Bat 2008 1 52.7387787 -0.438920572
In [14]:
#checking the commonname counts
uncleaned = Hibuncleaned_data.commonname.value_counts()
uncleaned
Out[14]:
Natterer's Bat            5056
Lesser Horseshoe Bat      4506
Daubenton's Bat           3945
Greater Horseshoe Bat     2963
Brown Long-eared Bat      2805
Whiskered/Brandt's Bat    1323
Bat                        557
Pipistrelle                412
Whiskered Bat              204
Brandt's Bat               135
Common Pipistrelle         109
Bechstein's Bat             75
Soprano Pipistrelle         21
Serotine                    14
Noctule Bat                  7
Grey Long-eared Bat          1
Name: commonname, dtype: int64
In [15]:
#Comparing classified and non-classified counts.
fig = plt.figure(figsize = (15,10))

# axes object for first subplot.
# 111 is a grid of 1 by 1 subplots
cleanedAxes = fig.add_subplot(111)
cleaned.plot.line(ax=cleanedAxes, ylim=[0,7000], color='red',fontsize=15)
cleanedAxes.set_ylabel('Counts',fontsize=20)
# second axes object representing the extra months (3,4,8,9,10) from the roost dataset, in the same place (twinned 
# with the ExtraAxes representing the extra months (3,4,8,9,10) from the hibernation dataset) shared (twiny() would
#allow sharing the y-axes).7000
uncleanedAxes = cleanedAxes.twinx()

# plot onto the ExtrasAxes:
uncleaned.plot(kind='bar',ax=uncleanedAxes,ylim=[0,7000], color='blue',fontsize=15)
# set the Extras Axes y-axis label,
uncleanedAxes.set_ylabel('Uncleaned Counts',fontsize=20)

# The mpatches.Patch() from matplotlib creates an element that can 
# be used in the handles parameter of the plt.legend() function to create an entry in the legend.
import matplotlib.patches as mpatches
legendpatch_list = ([mpatches.Patch(color='red', label='Cleaned Counts')]+
                    [mpatches.Patch(color='blue', label='Uncleaned Counts')])
                      
#the x,y values for the location are figure coordinates between 0 and 1.

plt.legend(handles=legendpatch_list, loc=(1.1,0.7), fontsize=20)


# and a title
plt.title('Hibernation Uncleaned and cleaned counts compared', 
          fontsize=20)
Out[15]:
<matplotlib.text.Text at 0xac4b95ec>

This looks very similar and the pattern is the same, which is reassuring. Classifying the 'Bat' species has not markedly changed the shape of the data.

Selecting one year of data to compare to one year of Roost data

Between 1999, when trends from standardised large-scale monitoring became available through the National Bat Monitoring Programme (NBMP), and 2013, bat populations have increased by 23 per cent; an assessment of the underlying smoothed trend shows this is a statistically significant increase. (jncc.defra, 2014) C8. Mammals of the wider countryside (bats)

In [16]:
# converting from object to int
Hibernation_data.month = Hibernation_data.month.astype(int)
Hibernation_data.year = Hibernation_data.year.astype(int)
Hibernation_data.dtypes
Out[16]:
observationid    object
recordkey        object
gridreference    object
year              int32
month             int32
latitude         object
longitude        object
commonname       object
dtype: object
In [17]:
#Selecting the months from the study and the years from when better counting began. 2014 is a part year so is not included
SelectYears_df = Hibernation_data[(Hibernation_data['month'].isin([1,2])) 
                                  & (Hibernation_data['year']>=1999)&(Hibernation_data['year']< 2014)]
SelectYears_df.tail(10)
len(SelectYears_df)
SelectYears_df.commonname.value_counts()
Out[17]:
Natterer's Bat            2302
Daubenton's Bat           1771
Lesser Horseshoe Bat      1721
Brown Long-eared Bat      1276
Greater Horseshoe Bat      798
Whiskered/Brandt's Bat     640
Pipistrelle                238
Common Pipistrelle         143
Whiskered Bat               95
Brandt's Bat                76
Soprano Pipistrelle         50
Bechstein's Bat             32
Serotine                    25
Noctule Bat                  4
Grey Long-eared Bat          1
Name: commonname, dtype: int64
In [18]:
#grouping to get a reasonable format to plot.
grp = SelectYears_df.groupby('year').apply(lambda x: x['commonname'].value_counts()).unstack().fillna(0)
grp.unstack()
grp.plot(figsize=(15,8))
Out[18]:
<matplotlib.axes._subplots.AxesSubplot at 0xac38660c>
In [19]:
#These are the classified roost bats with the commonname bat replaced by the nearest neighbour algorithm and the erroneous 
#grid refences removed. This is therefore the cleaned data to work with.

Roost_data  = pd.read_sql_query("SELECT * FROM ClassifiedRoostBats",conn)
Roost_data.head()
Out[19]:
observationid recordkey gridreference year month latitude longitude commonname
0 463208902 2 TR12 2010 6 51.02961366 1.071575094 Serotine
1 463208903 3 TR12 2010 6 51.02961366 1.071575094 Serotine
2 463199389 4 TQ97 2010 10 51.478886 0.813687061 Serotine
3 463199390 5 TQ97 2010 6 51.478886 0.813687061 Serotine
4 463209418 6 TR12 2010 6 51.02961366 1.071575094 Brown Long-eared Bat
In [20]:
#minimum and maximum year
syear = Roost_data.year.min()
eyear = Roost_data.year.max()
print(syear, eyear)
1977 2014
In [21]:
#Selecting the uncleaned roost data from the database
Uncleaned_Roost_data = pd.read_sql_query("SELECT * FROM RoostBats", conn)
Uncleaned_Roost_data.head()
Out[21]:
observationid recordkey gridreference commonname year month latitude longitude
0 463208902 2 TR12 Serotine 2010 6 51.02961366 1.071575094
1 463208903 3 TR12 Serotine 2010 6 51.02961366 1.071575094
2 463199389 4 TQ97 Serotine 2010 10 51.478886 0.813687061
3 463199390 5 TQ97 Serotine 2010 6 51.478886 0.813687061
4 463209418 6 TR12 Brown Long-eared Bat 2010 6 51.02961366 1.071575094

Comparison of classified and unclassified data.

In [22]:
#generating commonname counts
cleanedRoost = Roost_data.commonname.value_counts()
In [23]:
#generating commonname counts
uncleanedRoost = Uncleaned_Roost_data.commonname.value_counts()

Comparing the cleaned and uncleaned datasets.

In [24]:
#Comparing classified and non-classified counts.
fig = plt.figure(figsize = (15,12))

# axes object for first subplot.
# 111 is a grid of 1 by 1 subplots
cleanAxes = fig.add_subplot(111)
cleanedRoost.plot.line(ax=cleanAxes, ylim=[0,7000], color='red',fontsize=15)
cleanAxes.set_ylabel('Counts',fontsize=20)
uncleanAxes = cleanAxes.twinx()

# plot onto the ExtrasAxes:
uncleanedRoost.plot(kind='bar',ax=uncleanAxes,ylim=[0,7000], color='blue',fontsize=15)
# set the Extras Axes y-axis label,
uncleanAxes.set_ylabel('Uncleaned Counts',fontsize=20)

plt.legend(handles=legendpatch_list, loc=(1.1,0.7), fontsize=20)
plt.xlabel('Common name', fontsize=12)
plt.ylabel('Counts', fontsize=20)
# and a title
plt.title('Uncleaned and cleaned Roost counts compared', 
          fontsize=20)
Out[24]:
<matplotlib.text.Text at 0xabfa280c>
In [25]:
#Comparing classified and non-classified counts.
fig = plt.figure(figsize = (10,8))

plt.scatter(uncleaned,uncleanedRoost, color=['red','blue'], marker='h',s =50)

axes = plt.gca()
m, b = np.polyfit(uncleaned,uncleanedRoost, 1)
plt.annotate('Outlier', xy=(5000,6400), xytext=(4400,6800), 
             arrowprops=dict(facecolor='black', shrink=0.05, width=8))
plt.annotate('Outlier', xy=(2800,2000), xytext=(2800,2500), 
             arrowprops=dict(facecolor='black', shrink=0.05, width=8))
X_plot = np.linspace(axes.get_xlim()[0],axes.get_xlim()[1],100)
plt.plot(X_plot, m*X_plot + b, '-', color='black')
plt.title('Correlation of uncleaned Roost and Hibernation Counts', fontsize=15, color='Black')
legendpatch_list = ([mpatches.Patch(color='red', label='Uncleaned Roost Counts')]+
                    [mpatches.Patch(color='blue', label='Uncleaned Hibernation Counts')])

plt.legend(handles=legendpatch_list, loc=(0.6,0.2), fontsize=12)
plt.xlabel('Bat Counts (Hibernation)',fontsize=15)
plt.ylabel('Bat Counts (Roost)',fontsize=15)
#We can see a strong correlation between the roost and hibernation datasets
Out[25]:
<matplotlib.text.Text at 0xaaf5faac>
In [26]:
#pearson coefficient to show strength of correlation 
scipy.stats.pearsonr(uncleaned,uncleanedRoost)
Out[26]:
(0.97622106280625609, 1.0825433911543889e-10)
In [27]:
#Comparing classified Roost and Hibernation counts.
fig = plt.figure(figsize = (10,8))

plt.scatter(cleaned,cleanedRoost, color=['red','blue'], marker='h',s =50)

axes = plt.gca()
m, b = np.polyfit(cleaned, cleanedRoost, 1)
X_plot = np.linspace(axes.get_xlim()[0],axes.get_xlim()[1],100)
plt.plot(X_plot, m*X_plot + b, '-', color='black')
plt.annotate('Outlier', xy=(5000,6400), xytext=(4500,6800), 
             arrowprops=dict(facecolor='black', shrink=0.05, width=8))
plt.annotate('Outlier', xy=(2800,2000), xytext=(2800,2500), 
             arrowprops=dict(facecolor='black', shrink=0.05, width=8))
plt.title('Correlation of cleaned Roost and Hibernation Counts', fontsize=15, color='Black')
legendpatch_list = ([mpatches.Patch(color='red', label='Cleaed Roost Counts')]+
                    [mpatches.Patch(color='blue', label='Cleaned Hibernation Counts')])

plt.legend(handles=legendpatch_list, loc=(0.6,0.2), fontsize=12)
plt.xlabel('Bat Counts (Hibernation)',fontsize=15)
plt.ylabel('Bat Counts (Roost)',fontsize=15)
#We can see a strong correlation between the roost and hibernation datasets
Out[27]:
<matplotlib.text.Text at 0xaac822ac>
In [28]:
#Pearson coefficient calculation
scipy.stats.pearsonr(cleaned, cleanedRoost)
Out[28]:
(0.97400877748374237, 9.1759412620459035e-10)

Correlation of Roost cleaned and uncleaned This is misleading because I removed commonname 'Bat' to correlate because the numbers had to be the same. I won't continue with this.

In [29]:
unRoost = Uncleaned_Roost_data[Uncleaned_Roost_data.commonname != 'Bat']
In [30]:
unRoost.head()
Out[30]:
observationid recordkey gridreference commonname year month latitude longitude
0 463208902 2 TR12 Serotine 2010 6 51.02961366 1.071575094
1 463208903 3 TR12 Serotine 2010 6 51.02961366 1.071575094
2 463199389 4 TQ97 Serotine 2010 10 51.478886 0.813687061
3 463199390 5 TQ97 Serotine 2010 6 51.478886 0.813687061
4 463209418 6 TR12 Brown Long-eared Bat 2010 6 51.02961366 1.071575094
In [31]:
unRoosts = unRoost.commonname.value_counts()
In [32]:
scipy.stats.pearsonr(cleanedRoost, unRoosts)
Out[32]:
(0.99999640370723697, 8.0607184248506627e-35)
In [33]:
#converting to int month and year have object dtypes
Roost_data.month = Roost_data.month.astype(int)
Roost_data.year = Roost_data.year.astype(int)
Roost_data.dtypes
Out[33]:
observationid    object
recordkey        object
gridreference    object
year              int32
month             int32
latitude         object
longitude        object
commonname       object
dtype: object
In [34]:
#Selecting the months from the study and the years from when better counting began. 2014 is a part year so is not included
RoostYears_df = Roost_data[(Roost_data['month'].isin([5,6,7])) & 
                                 (Roost_data['year'] >=1999)&(Roost_data['year']< 2014)]
len(RoostYears_df)
Out[34]:
12744
In [35]:
grproost = RoostYears_df.groupby('year').apply(lambda x: x['commonname'].value_counts()).unstack().fillna(0)
grproost.unstack()
grproost.plot(figsize=(14,8))
Out[35]:
<matplotlib.axes._subplots.AxesSubplot at 0xab2c714c>

These are the combined roost and hibernation counts for the period 1999 - 2013

In [36]:
Combined_period = [RoostYears_df, SelectYears_df]
Combined_df = pd.concat(Combined_period)
Combined_df = Combined_df[Combined_period[0].columns]
Combined_df.head()
len(Combined_df)
Out[36]:
21916

The exploration below is a comparison with the visualisations produced by the Bat Conservation Tust in the 2014 Survey as described above. I will visulaise the priority species comtained in the PrioritySpecies list and the other species contained in the OtherSpecies list. This is to estimate the integrity of the data I have, to ensure it follows similar patterns as shown in the survey.

In [37]:
#Reading in the data downloaded from the JNCC/DEFRA report at http://jncc.defra.gov.uk/page-4271
defra = pd.read_csv('data/Normalised.csv')
defra
Out[37]:
Year All Species Daubenton's Bat Lesser Horseshoe Bat Noctule Common Pipistrelle Soprano Pipistrelle Serotine Natterer's Bat Brown Long-eared Bat
0 1999 100.00 100.0 100.0 100.0 100.0 100.0 100.0 0.0 0.0
1 2000 108.57 97.1 112.1 121.5 109.5 102.4 110.5 0.0 0.0
2 2001 111.53 105.3 113.9 81.3 138.8 129.9 109.4 0.0 100.0
3 2002 108.56 100.7 120.2 107.0 123.6 108.6 89.0 100.0 103.0
4 2003 115.82 106.8 126.5 141.8 134.8 108.8 98.8 104.9 91.8
5 2004 110.88 106.2 138.8 112.3 135.2 92.4 110.9 98.8 83.4
6 2005 126.31 106.9 146.8 135.8 155.9 102.2 112.5 126.7 110.4
7 2006 128.35 108.1 155.1 132.6 162.4 100.9 106.4 130.0 120.7
8 2007 127.04 108.8 151.9 187.4 145.8 94.9 92.5 114.5 123.5
9 2008 132.52 106.7 161.0 203.0 156.7 98.0 108.1 120.7 112.5
10 2009 130.48 103.3 175.2 176.4 168.8 94.2 101.5 126.5 106.4
11 2010 133.92 107.5 176.4 171.7 165.2 108.5 108.2 134.0 101.0
12 2011 134.10 107.3 198.1 159.2 172.4 91.1 103.2 124.7 126.1
13 2012 123.41 112.7 177.6 146.0 152.9 85.5 91.8 116.8 108.4
14 2013 130.92 110.8 196.5 139.0 173.0 84.7 106.1 126.5 119.6
15 2014 126.03 104.4 204.9 136.4 184.8 90.6 94.5 104.3 109.2
In [38]:
#Checking the columns
defra.columns
Out[38]:
Index(['Year', 'All Species', 'Daubenton's Bat', 'Lesser Horseshoe Bat',
       'Noctule', 'Common Pipistrelle', 'Soprano Pipistrelle', 'Serotine',
       'Natterer's Bat', 'Brown Long-eared Bat'],
      dtype='object')
In [39]:
#plotting the JNCC/DEFRA priority commmonname data
pri = defra[['Year','Lesser Horseshoe Bat','Soprano Pipistrelle','Brown Long-eared Bat','Noctule']]
ax = pri.plot(x='Year', figsize=(15,12))
plt.title('Defra Priority Bat Counts 1998 - 2014', fontsize=20, color='Black')
plt.legend(fontsize=15, loc=2)
ax.set_ylim(100,250)
plt.xlabel('Year', fontsize=20)
plt.ylabel('Bat Counts', fontsize=20)
Out[39]:
<matplotlib.text.Text at 0xaa54e68c>
In [40]:
#plotting the rest of the species
rest = defra[['Year', 'Common Pipistrelle','Serotine', "Natterer's Bat", "Daubenton's Bat"]]
ax = rest.plot(x='Year', figsize=(15,12))
plt.title('Defra Additional Bat Counts 1998 - 2014', fontsize=20, color='Black')
plt.legend(fontsize=15, loc=2)
ax.set_ylim(100,250)
plt.xlabel('Year', fontsize=20)
plt.ylabel('Bat Counts', fontsize=20)
Out[40]:
<matplotlib.text.Text at 0xaa55902c>
In [41]:
#Selecting my combined data to plot to compare with the DEFRA data
Priority = Combined_df[(Combined_df['commonname'] =='Lesser Horseshoe Bat')|
                       (Combined_df['commonname'] =='Greater Horseshoe Bat')|
                       (Combined_df['commonname'] =='Soprano Pipistrelle')|
                       (Combined_df['commonname'] =='Brown Long-eared Bat')|
                       (Combined_df['commonname'] =='Noctule Bat')]
Priority.commonname.value_counts()
Out[41]:
Lesser Horseshoe Bat     3693
Soprano Pipistrelle      2590
Brown Long-eared Bat     2468
Greater Horseshoe Bat    1086
Noctule Bat                31
Name: commonname, dtype: int64
In [42]:
#Grouping to achieve the same shape as the DEFRA data
priority = Priority.groupby('year').apply(lambda x: x['commonname'].value_counts()).unstack().fillna(0)
priority.unstack()
Out[42]:
                       year
Brown Long-eared Bat   1999     77
                       2000     81
                       2001    109
                       2002    136
                       2003    142
                       2004    132
                       2005    154
                       2006    179
                       2007    181
                       2008    223
                       2009    212
                       2010    237
                       2011    229
                       2012    173
                       2013    203
Greater Horseshoe Bat  1999     39
                       2000     36
                       2001     46
                       2002     69
                       2003     59
                       2004     72
                       2005     43
                       2006     70
                       2007     78
                       2008     84
                       2009     87
                       2010     99
                       2011     95
                       2012    110
                       2013     99
                              ... 
Noctule Bat            1999      2
                       2000      3
                       2001      2
                       2002      4
                       2003      2
                       2004      5
                       2005      2
                       2006      2
                       2007      0
                       2008      1
                       2009      0
                       2010      2
                       2011      4
                       2012      0
                       2013      2
Soprano Pipistrelle    1999    162
                       2000    155
                       2001     91
                       2002    144
                       2003    138
                       2004    163
                       2005    156
                       2006    150
                       2007    177
                       2008    212
                       2009    214
                       2010    234
                       2011    214
                       2012    185
                       2013    195
dtype: float64
In [43]:
#Plotting the priority species in my data
priority.plot(figsize=(15,12))
plt.legend(fontsize=15, loc=2)
plt.title('Priority Bat Counts 1999 - 2013', fontsize=20, color='Black')
plt.xlabel('Year', fontsize=20)
plt.ylabel('Bat Counts', fontsize=20)
Out[43]:
<matplotlib.text.Text at 0xaa27412c>
In [44]:
pd.unique(Combined_df['commonname'])
Out[44]:
array(['Serotine', 'Brown Long-eared Bat', 'Common Pipistrelle',
       'Soprano Pipistrelle', 'Lesser Horseshoe Bat', 'Pipistrelle',
       "Natterer's Bat", "Daubenton's Bat", 'Greater Horseshoe Bat',
       'Noctule Bat', 'Whiskered Bat', "Brandt's Bat",
       "Whiskered/Brandt's Bat", 'Grey Long-eared Bat', "Bechstein's Bat"], dtype=object)
In [45]:
#selected the non-priority species
Others = Combined_df[(Combined_df['commonname'] == 'Common Pipistrelle')|
                       (Combined_df['commonname'] == 'Serotine')|
                       (Combined_df['commonname'] == "Natterer's Bat")|
                       (Combined_df['commonname'] == "Daubenton's Bat")|
                       (Combined_df['commonname'] == "Whiskered/Brandt's Bat")]
Others.commonname.value_counts()
Out[45]:
Common Pipistrelle        3796
Natterer's Bat            2942
Daubenton's Bat           1870
Serotine                   664
Whiskered/Brandt's Bat     662
Name: commonname, dtype: int64
In [46]:
#shaping to plot
others = Others.groupby('year').apply(lambda x: x['commonname'].value_counts()).unstack().fillna(0)
others.unstack()
Out[46]:
                        year
Common Pipistrelle      1999    195
                        2000    167
                        2001    133
                        2002    194
                        2003    196
                        2004    231
                        2005    234
                        2006    300
                        2007    273
                        2008    290
                        2009    331
                        2010    341
                        2011    318
                        2012    303
                        2013    290
Daubenton's Bat         1999     83
                        2000    100
                        2001     99
                        2002    100
                        2003    105
                        2004    139
                        2005    123
                        2006    108
                        2007    118
                        2008    148
                        2009    167
                        2010    149
                        2011    156
                        2012    130
                        2013    145
                               ... 
Serotine                1999     47
                        2000     44
                        2001     38
                        2002     60
                        2003     48
                        2004     39
                        2005     41
                        2006     35
                        2007     29
                        2008     50
                        2009     57
                        2010     48
                        2011     39
                        2012     40
                        2013     49
Whiskered/Brandt's Bat  1999     26
                        2000     23
                        2001     33
                        2002     39
                        2003     36
                        2004     39
                        2005     30
                        2006     33
                        2007     40
                        2008     60
                        2009     58
                        2010     57
                        2011     56
                        2012     58
                        2013     74
dtype: int64
In [47]:
#plotting the others from my data
others.plot(figsize=(15,12))
plt.legend(fontsize=15, loc=2)
plt.title('Other Species Bat Counts 1999 - 2013', fontsize=20, color='Black')
plt.xlabel('Year', fontsize=20)
plt.ylabel('Bat Counts', fontsize=20)
Out[47]:
<matplotlib.text.Text at 0xaae61a4c>
In [48]:
#Writing the combined data to csv to create a table in postgreSQL
Combined_df.to_csv('data/CombinedCleanedData.csv', index = False)
In [49]:
pd.unique(Combined_df.commonname)
Out[49]:
array(['Serotine', 'Brown Long-eared Bat', 'Common Pipistrelle',
       'Soprano Pipistrelle', 'Lesser Horseshoe Bat', 'Pipistrelle',
       "Natterer's Bat", "Daubenton's Bat", 'Greater Horseshoe Bat',
       'Noctule Bat', 'Whiskered Bat', "Brandt's Bat",
       "Whiskered/Brandt's Bat", 'Grey Long-eared Bat', "Bechstein's Bat"], dtype=object)
In [50]:
#dfr.loc[dfr['commonname'] != 'Bat']
In [51]:
#Grouping my combined roost and hibernation cleaned data to plot
grpComb = Combined_df.groupby('year').apply(lambda x: x['commonname'].value_counts()).unstack().fillna(0)
grpComb.unstack()
grpComb.plot(figsize=(15,12))
plt.title('Combined Bat Counts 1999 - 2013', fontsize=20, color='Black')
plt.xlabel('Year', fontsize=20)
plt.ylabel('Bat Counts', fontsize=20)
Out[51]:
<matplotlib.text.Text at 0xa9cd10ec>

Interestingly the study completed by Defra - http://jncc.defra.gov.uk/page-4271 in December 2015 shows that the bats Daubenton’s bat, common pipistrelle and lesser horseshoe bat have increased in numbers which is supported by the plot above.

I will look at grey long-eared bats in the Roost dataset. These are thought to be in danger of becoming extinct. The Roost dataset covers all months.

In [52]:
GreyBat =Roost_data[Roost_data.commonname == 'Grey Long-eared Bat']
GreyBat
Out[52]:
observationid recordkey gridreference year month latitude longitude commonname
19638 463196990 25815 SX90 2013 6 49.95132869 -3.457907599 Grey Long-eared Bat
20768 463208363 27337 SX90 2014 10 49.95132869 -3.457907599 Grey Long-eared Bat
20769 463208364 27338 SX90 2014 6 49.95132869 -3.457907599 Grey Long-eared Bat
21601 463208861 28494 SZ50 2014 12 49.95073074 -1.227753192 Grey Long-eared Bat
21602 463208862 28495 SZ50 2014 6 49.95073074 -1.227753192 Grey Long-eared Bat
In [53]:
GreyBat1 = Hibernation_data[Hibernation_data.commonname == 'Grey Long-eared Bat']
GreyBat1
Out[53]:
observationid recordkey gridreference year month latitude longitude commonname
10436 463163965 11515 TQ53 2000 1 51.12020834 0.220295021 Grey Long-eared Bat
In [54]:
greys =(GreyBat, GreyBat1)
GreyBat2 =  pd.concat(greys)
GreyBat2 = GreyBat2[greys[0].columns]
GreyBat2.head()
Out[54]:
observationid recordkey gridreference year month latitude longitude commonname
19638 463196990 25815 SX90 2013 6 49.95132869 -3.457907599 Grey Long-eared Bat
20768 463208363 27337 SX90 2014 10 49.95132869 -3.457907599 Grey Long-eared Bat
20769 463208364 27338 SX90 2014 6 49.95132869 -3.457907599 Grey Long-eared Bat
21601 463208861 28494 SZ50 2014 12 49.95073074 -1.227753192 Grey Long-eared Bat
21602 463208862 28495 SZ50 2014 6 49.95073074 -1.227753192 Grey Long-eared Bat
In [55]:
GreyBat2.columns
Out[55]:
Index(['observationid', 'recordkey', 'gridreference', 'year', 'month',
       'latitude', 'longitude', 'commonname'],
      dtype='object')
In [56]:
#lonelybats = lonelybat[['latitude','longitude','commonname']]
#lonelybats.reset_index()
#lonelybats

The format of the dataframe created indexing issues so I created a dataframe and wrote it to csv. Running the maps was causing the kernel to crash, hence I created a 'Map' notepbook for this purpose.

In [57]:
#df2 = pd.DataFrame({'Latitude' : [49.890102,51.049767],
                   # 'Longitude' : [-3.5327096,0.13863303],
                   # 'commonname' : ['Grey Long-eared Bat', 'Grey Long-eared Bat']})
#df2
In [58]:
GreyBat2.to_csv('data/greybats.csv', index = False)

I didn't have enough data on grey long-eared bats to show anything significant so after mapping i abandoned this quest.

'The following represents an assessment of the weather experienced across the UK during winter 2010 / 2011 (December 2010 to February 2011) and how it compares with the 1981 to 2010 averages.

Mean temperatures over the UK were 4.8 °C below average during December, 0.6 °C below average in January and 1.7 °C above average in February. The UK mean temperature for the winter as a whole was 2.4 °C, making it less cold than winter 2009/10 which was 1.6 °C but still the second-coldest winter since 1985/86 with 2.3 °C. Over Scotland and Northern Ireland it was the second-coldest winter since 1985/86 and 1978/79 respectively, with again only last winter having been colder. Over Northern Ireland it was equal sixth-coldest winter in the series from 1910. Over England and Wales it was the second-coldest since 1995/96, with only last winter having been colder.' Met office records - http://www.metoffice.gov.uk/climate/uk

I have taken 2010 as an example because there was a peak here and because 2010 was a cold winter, I thought this was strange.

In [59]:
#Plot 2010 Hibernation
One_year = Hibernation_data[(Hibernation_data['month'].isin([1,2])) & 
                                 (Hibernation_data['year']==2010)]
One_year_grp = One_year.groupby('year').apply(lambda x: x['commonname'].value_counts()).unstack().fillna(0)
In [60]:
pd.unique(One_year.commonname)
Out[60]:
array(['Lesser Horseshoe Bat', 'Greater Horseshoe Bat', "Natterer's Bat",
       'Brown Long-eared Bat', "Daubenton's Bat", 'Whiskered Bat',
       "Brandt's Bat", 'Pipistrelle', 'Common Pipistrelle',
       "Whiskered/Brandt's Bat", "Bechstein's Bat", 'Serotine',
       'Soprano Pipistrelle'], dtype=object)
In [61]:
#getting the value counts of commonname for comparison
new = One_year.commonname.value_counts()
In [62]:
#Plotting hiberantion counts for 2010
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(new)))
ax=new.plot(kind='bar',figsize=(14,8), color=my_colors, fontsize=12)

plt.title('Hibernation Bat Counts 2010', fontsize=20)
plt.xlabel('Bat Types (Common Name)', fontsize=15)
plt.ylabel('Total Count', fontsize=15)
Out[62]:
<matplotlib.text.Text at 0xa99c4ccc>
In [63]:
#Selecting the months and the year to plot
RoostSelectYear_df = Roost_data[(Roost_data['month'].isin([5,6,7]))
& (Roost_data['year']== 2010)]
Roost_Year_grp = RoostSelectYear_df.groupby('year').apply(lambda x: x['commonname'].value_counts()).unstack().fillna(0)
Roost_Year_grp.unstack()
Roost_Year_grp.reset_index()
Out[63]:
commonname year 0
0 Common Pipistrelle 2010 319
1 Soprano Pipistrelle 2010 230
2 Lesser Horseshoe Bat 2010 158
3 Brown Long-eared Bat 2010 117
4 Pipistrelle 2010 70
5 Natterer's Bat 2010 52
6 Serotine 2010 43
7 Greater Horseshoe Bat 2010 25
8 Daubenton's Bat 2010 8
9 Whiskered Bat 2010 5
10 Whiskered/Brandt's Bat 2010 3
11 Brandt's Bat 2010 3
12 Noctule Bat 2010 2
In [64]:
#getting the counts
newroost = RoostSelectYear_df.commonname.value_counts()
newroost
Out[64]:
Common Pipistrelle        319
Soprano Pipistrelle       230
Lesser Horseshoe Bat      158
Brown Long-eared Bat      117
Pipistrelle                70
Natterer's Bat             52
Serotine                   43
Greater Horseshoe Bat      25
Daubenton's Bat             8
Whiskered Bat               5
Whiskered/Brandt's Bat      3
Brandt's Bat                3
Noctule Bat                 2
Name: commonname, dtype: int64
In [65]:
#plotting the rrost counts
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(newroost)))
ax=newroost.plot(kind='bar',figsize=(14,8),color=my_colors, fontsize=12)
plt.title('Roost Bat Counts 2010', fontsize=20, )
plt.xlabel('Bat Types (Common Name)', fontsize=15)
plt.ylabel('Total Count', fontsize=15)
Out[65]:
<matplotlib.text.Text at 0xa9919dcc>

The Bat Conservation Trust report that the Pipistrelle has disparity between the Hobernation period and the Roost period and the values certainly support this. Further work could involve classifying the bats in the areas where Pipistrelle are seen in the Roost months to try to determine if the problem lies with classification - perhaps Pipistrelle are more difficult to identify during Hibernation.

In [ ]:
 
In [66]:
roostmerge = pd.DataFrame(Roost_Year_grp)
hibmerge = pd.DataFrame(One_year_grp)
In [67]:
hibmerge.unstack()
hibmerge1 = hibmerge.reset_index()
hibmerge1.columns
Out[67]:
Index(['commonname', 'year', 0], dtype='object')
In [68]:
hibmerge1.rename(columns={0: 'count'}, inplace=True)
hibmerge1.head()
Out[68]:
commonname year count
0 Natterer's Bat 2010 258
1 Lesser Horseshoe Bat 2010 169
2 Daubenton's Bat 2010 141
3 Brown Long-eared Bat 2010 120
4 Greater Horseshoe Bat 2010 74
In [69]:
roostmerge.unstack()
roostmerge1 = roostmerge.reset_index()
roostmerge1.columns
Out[69]:
Index(['commonname', 'year', 0], dtype='object')
In [70]:
roostmerge1.rename(columns={0: 'count'}, inplace=True)
roostmerge1.columns
Out[70]:
Index(['commonname', 'year', 'count'], dtype='object')

Add in a plot where it shows both the hibernation dataset and the roost dataset on the same plot and do the same for the whole period if possible.

In [71]:
fig = plt.figure()
compare = fig.add_subplot(111)
new.plot(ax=compare, color='red',fontsize=18, figsize=(15,10),ylim=[0,350])
plt.title('Total Bats 2010', fontsize=25, color='Black')
plt.xlabel('Bat Name', fontsize=20)
plt.ylabel('Total Hibernation Count',fontsize=20)
comparison = compare.twinx()
newroost.plot(ax=comparison,color='blue',fontsize=18,ylim=[0,350])
comparison.set_ylabel('Total Roost Count',fontsize=20)
import matplotlib.patches as mpatches
patch_list = ([mpatches.Patch(color='red', label='Hibernation Counts')]+
                    [mpatches.Patch(color='blue', label='Roost Counts')])

plt.legend(handles=patch_list, loc=(0.6,0.2), fontsize=20)
#It would seem that there is a very similar pattern in the Roost data and Hibernation data for 2010. This plot is not quite as
#good as the two bar plots. I wonder if perhaps one with circles would be better. If there's time, i'll try it.
Out[71]:
<matplotlib.legend.Legend at 0xa98ef88c>

Checking the counts of pipistrelle for roost and hibernation to see how much they differ.

In [72]:
#pipistrelle common name hib
pip = One_year[One_year['commonname']=='Pipistrelle']
len(pip)
Out[72]:
23
In [73]:
#pipistrelle common name roost
pip1 = RoostSelectYear_df[RoostSelectYear_df['commonname']=='Pipistrelle']
len(pip1)
Out[73]:
70

Counting the hibernation and rrost datasets to produce a correlation plot.

In [74]:
count=RoostSelectYear_df['commonname'].value_counts()
count1=One_year['commonname'].value_counts()
count
Out[74]:
Common Pipistrelle        319
Soprano Pipistrelle       230
Lesser Horseshoe Bat      158
Brown Long-eared Bat      117
Pipistrelle                70
Natterer's Bat             52
Serotine                   43
Greater Horseshoe Bat      25
Daubenton's Bat             8
Whiskered Bat               5
Whiskered/Brandt's Bat      3
Brandt's Bat                3
Noctule Bat                 2
Name: commonname, dtype: int64
In [75]:
#RoostList

I removed the values that were in the Roost dataset but not in the hibernation dataset or vice versa for 2010. This is not concerning in any way, these species appear in other years and appears to be due to counting methods and not bat numbers. However I will check other years and if a pattern emerges with the same names appearing I will perform further analysis. I have checked the documentation from the Bat Conservation Trust and the bats removed are not specified in the Roost study.

In [76]:
Roost = list(pd.unique(RoostSelectYear_df.commonname))
Roost
Out[76]:
['Serotine',
 'Brown Long-eared Bat',
 'Common Pipistrelle',
 'Soprano Pipistrelle',
 'Lesser Horseshoe Bat',
 'Pipistrelle',
 "Natterer's Bat",
 "Daubenton's Bat",
 'Greater Horseshoe Bat',
 'Noctule Bat',
 'Whiskered Bat',
 "Brandt's Bat",
 "Whiskered/Brandt's Bat"]
In [77]:
One = list(pd.unique(One_year.commonname))
One
Out[77]:
['Lesser Horseshoe Bat',
 'Greater Horseshoe Bat',
 "Natterer's Bat",
 'Brown Long-eared Bat',
 "Daubenton's Bat",
 'Whiskered Bat',
 "Brandt's Bat",
 'Pipistrelle',
 'Common Pipistrelle',
 "Whiskered/Brandt's Bat",
 "Bechstein's Bat",
 'Serotine',
 'Soprano Pipistrelle']
In [78]:
isin = []
for v in One:
    if v not in Roost:
        isin.append(v)
isin
#remove this
Out[78]:
["Bechstein's Bat"]
In [79]:
isn = []
for v in Roost:
    if v not in One:
        isin.append(v)
isn
Out[79]:
[]
In [80]:
#dropping values causing mismatch
RoostSelectYear_df = RoostSelectYear_df.drop(RoostSelectYear_df[RoostSelectYear_df.commonname == 'Bechstein\'s Bat'].index)

pd.unique(RoostSelectYear_df.commonname)
Out[80]:
array(['Serotine', 'Brown Long-eared Bat', 'Common Pipistrelle',
       'Soprano Pipistrelle', 'Lesser Horseshoe Bat', 'Pipistrelle',
       "Natterer's Bat", "Daubenton's Bat", 'Greater Horseshoe Bat',
       'Noctule Bat', 'Whiskered Bat', "Brandt's Bat",
       "Whiskered/Brandt's Bat"], dtype=object)
In [81]:
#dropping values causing mismatch
One_year = One_year.drop(One_year[One_year.commonname == 'Noctule Bat'].index)
pd.unique(One_year.commonname)
Out[81]:
array(['Lesser Horseshoe Bat', 'Greater Horseshoe Bat', "Natterer's Bat",
       'Brown Long-eared Bat', "Daubenton's Bat", 'Whiskered Bat',
       "Brandt's Bat", 'Pipistrelle', 'Common Pipistrelle',
       "Whiskered/Brandt's Bat", "Bechstein's Bat", 'Serotine',
       'Soprano Pipistrelle'], dtype=object)

Merging the 2010 datasets for rrost and hibernation to plot together

In [82]:
new_hib=pd.DataFrame(count)
new_hib.reset_index()
new_hib.columns
new_hib = new_hib.rename(columns = {'commonname' : 'HibernationCount'})
new_hib.index.names = ['commonname']
new_hib.reset_index(level=0, inplace=True)
In [83]:
new_hib
Out[83]:
commonname HibernationCount
0 Common Pipistrelle 319
1 Soprano Pipistrelle 230
2 Lesser Horseshoe Bat 158
3 Brown Long-eared Bat 117
4 Pipistrelle 70
5 Natterer's Bat 52
6 Serotine 43
7 Greater Horseshoe Bat 25
8 Daubenton's Bat 8
9 Whiskered Bat 5
10 Whiskered/Brandt's Bat 3
11 Brandt's Bat 3
12 Noctule Bat 2

rename the count columns to hib_count and roost_count

In [84]:
new_roost=pd.DataFrame(count1)
new_roost.reset_index()
new_roost.columns
new_roost = new_roost.rename(columns = {'commonname' : 'RoostCount'})
new_roost.index.names = ['commonname']
new_roost.reset_index(level=0, inplace=True)
new_roost.columns
Out[84]:
Index(['commonname', 'RoostCount'], dtype='object')
In [85]:
combined = pd.merge(new_hib, new_roost, on=['commonname'])
combined
Out[85]:
commonname HibernationCount RoostCount
0 Common Pipistrelle 319 22
1 Soprano Pipistrelle 230 4
2 Lesser Horseshoe Bat 158 169
3 Brown Long-eared Bat 117 120
4 Pipistrelle 70 23
5 Natterer's Bat 52 258
6 Serotine 43 5
7 Greater Horseshoe Bat 25 74
8 Daubenton's Bat 8 141
9 Whiskered Bat 5 14
10 Whiskered/Brandt's Bat 3 54
11 Brandt's Bat 3 5

Make these plots better and show the comparison between the correlated plot and grabage plot to show the identification mey be the issue.

In [86]:
combined.plot.scatter(x='HibernationCount', y='RoostCount')
Out[86]:
<matplotlib.axes._subplots.AxesSubplot at 0xa98bf72c>

The survey data is taken from the included report from The Bat Consevaion Trust. It can be seen that there is little similarity between the commonnames listed and the ones in the datasets.

In [87]:
survey = pd.read_csv('data/surveys.csv')
survey
Out[87]:
commonname hibernation roost
0 Daubenton's Bat Hibernation No
1 Lesser Horseshoe Bat Hibernation Roost
2 Noctule Bat 0 0
3 Common Pipistrelle 0 Roost
4 Soprano Pipistrelle 0 Roost
5 Serotine 0 Roost
6 Natterer's bat Hibernation Roost
7 Brown Long-eared Bat 0 0
8 Greater Horseshoe Bat Hibernation Roost
9 Natterer's Bat Hibernation Roost
10 Whiskered/Brandt's Bat Hibernation 0
11 Bechstein's Bat 0 0
12 Brandt's Bat 0 0
13 Whiskered Bat 0 0
14 Daubenton's Bat Hibernation 0
In [88]:
all_merge = pd.merge(combined, survey, on='commonname')
all_merge.drop([7], inplace=True)
In [89]:
all_merge
Out[89]:
commonname HibernationCount RoostCount hibernation roost
0 Common Pipistrelle 319 22 0 Roost
1 Soprano Pipistrelle 230 4 0 Roost
2 Lesser Horseshoe Bat 158 169 Hibernation Roost
3 Brown Long-eared Bat 117 120 0 0
4 Natterer's Bat 52 258 Hibernation Roost
5 Serotine 43 5 0 Roost
6 Greater Horseshoe Bat 25 74 Hibernation Roost
8 Daubenton's Bat 8 141 Hibernation 0
9 Whiskered Bat 5 14 0 0
10 Whiskered/Brandt's Bat 3 54 Hibernation 0
11 Brandt's Bat 3 5 0 0

As can be seen form the table, the roost and hibernation counts have little in common, the commonnames are different for each and there are names in the data which are apparently not in the surveys.

In [90]:
#Adding colours
#from itertools import cycle, islice
#combined_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue', '#00E28E','#008DB8', '#00FF80',
                               #'#00C69C','red', 'blue', 'green', 'cyan', 'yellow']),None, len(combined)))

combined.plot.bar(x='commonname', subplots=True,color=['blue','red'], figsize=(15,10), fontsize=12)
plt.xlabel('Common Name', fontsize=18)
plt.ylabel('Total Count',fontsize=18)
Out[90]:
<matplotlib.text.Text at 0xa8c2a80c>
In [91]:
#plotting the correlation between roost and hibernation values for 2010
fig = plt.figure(figsize = (12,10))
ax.set_xlim(0,400)
ax.set_ylim(0,400)
plt.scatter(count,count1, color=['red','blue'], marker='h',s =50)
plt.grid.grid_line_color = None
axes = plt.gca()

m, b = np.polyfit(count, count1, 1)

X_plot = np.linspace(axes.get_xlim()[0],axes.get_xlim()[1],100)
plt.plot(X_plot, m*X_plot + b, '-', color='black')

plt.title('Correlation of Roost and Hibernation Counts 2010', fontsize=20, color='Black')
legendpatch_list = ([mpatches.Patch(color='red', label='2010 Roost Counts')]+
                    [mpatches.Patch(color='blue', label='2010 Hibernation Counts')])

plt.legend(handles=legendpatch_list, loc=(0.6,0.2), fontsize=15)
plt.xlabel('Bat Counts (Hibernation)', fontsize=15)
plt.ylabel('Bat Counts (Roost)', fontsize=15)
#We can see a strong correlation between the roost and hibernation datasets
Out[91]:
<matplotlib.text.Text at 0xa8b4ea6c>
In [92]:
#correlation values
scipy.stats.pearsonr(count,count1)
Out[92]:
(0.99024652048030692, 9.10646391995923e-11)

I'm unsure of these results and will revisit this.

In [93]:
#unclean hib counts
unsortedh = Hibernation_data.commonname.value_counts()
unsortedh
Out[93]:
Natterer's Bat            5061
Lesser Horseshoe Bat      4547
Daubenton's Bat           3950
Greater Horseshoe Bat     2965
Brown Long-eared Bat      2835
Whiskered/Brandt's Bat    1324
Pipistrelle                568
Common Pipistrelle         278
Whiskered Bat              204
Soprano Pipistrelle        137
Brandt's Bat               135
Bechstein's Bat             75
Serotine                    45
Noctule Bat                  8
Grey Long-eared Bat          1
Name: commonname, dtype: int64
In [94]:
#unclean roost counts
unsortedr = Roost_data.commonname.value_counts()
unsortedr
Out[94]:
Common Pipistrelle        6408
Soprano Pipistrelle       4294
Lesser Horseshoe Bat      4198
Pipistrelle               3437
Brown Long-eared Bat      1893
Serotine                  1107
Natterer's Bat            1042
Greater Horseshoe Bat      514
Daubenton's Bat            160
Whiskered Bat               97
Noctule Bat                 73
Brandt's Bat                60
Whiskered/Brandt's Bat      33
Bechstein's Bat             22
Grey Long-eared Bat          5
Name: commonname, dtype: int64
In [95]:
#correlation values
scipy.stats.pearsonr(unsortedh,unsortedr)
Out[95]:
(0.97400877748374237, 9.1759412620459035e-10)
In [ ]:
 

'Pearson’s R2 is a measure of correlation between two or more continuous variables. It can be visualised as drawing a line of best fit on a scatter plot. The R2 coefficient value indicates how well the values correlate. A value near +1 shows that the values vary together, while a value near −1 shows that they vary in opposite directions. A value near zero shows that there’s little correlation between the data. We generally need at least 50 data items before this measure can give sensible results. In Python, Pearson’s R2 is calculated by the scipy.stats.pearsonr method. This takes two series of data, corresponding to the x and y values, and returns R2 and a p value. The p value can be used to decide whether to reject the null hypothesis of no correlation. However, a naive application of this test can be misleading. The data should be examined first to see if attempts to fit a straight line to the data are sensible. Furthermore, R2 can give confused results if the values being correlated have very different ranges. In these cases, it can be useful to normalise the data by scaling each variable to fall into the range 0–1.

-If the p-value is low (generally less than 0.05), then your correlation is statistically significant, and we can detrmine that there is correlation between the datasets.' The Open University

Combine Roost and Hibernation Data after this to look at the period from 1999 - 2013 overall. This will help to rule out incidental apparent correlation results.

In [96]:
#combined roost and hiberantion data from 1999-2013
SelectYear = SelectYears_df[['commonname','year']]
pivoted = SelectYear.pivot_table(index=['commonname'], aggfunc='count')
pivoted.reset_index(['year','commonname'],inplace=True)
pivoted = pivoted.rename(columns = {'year' : 'Count'})
pivoted.head(20)
Out[96]:
commonname Count
0 Bechstein's Bat 32
1 Brandt's Bat 76
2 Brown Long-eared Bat 1276
3 Common Pipistrelle 143
4 Daubenton's Bat 1771
5 Greater Horseshoe Bat 798
6 Grey Long-eared Bat 1
7 Lesser Horseshoe Bat 1721
8 Natterer's Bat 2302
9 Noctule Bat 4
10 Pipistrelle 238
11 Serotine 25
12 Soprano Pipistrelle 50
13 Whiskered Bat 95
14 Whiskered/Brandt's Bat 640
In [97]:
#grouping to plot
SelectYear1 = RoostYears_df[['commonname','year']]
pivoted1 = SelectYear1.pivot_table(index=['commonname'], aggfunc='count')
pivoted1.reset_index(['year','commonname'],inplace=True)
pivoted1 = pivoted1.rename(columns = {'year' : 'Count'})
pivoted1.head(20)
Out[97]:
commonname Count
0 Bechstein's Bat 10
1 Brandt's Bat 26
2 Brown Long-eared Bat 1192
3 Common Pipistrelle 3653
4 Daubenton's Bat 99
5 Greater Horseshoe Bat 288
6 Grey Long-eared Bat 1
7 Lesser Horseshoe Bat 1972
8 Natterer's Bat 640
9 Noctule Bat 27
10 Pipistrelle 1589
11 Serotine 639
12 Soprano Pipistrelle 2540
13 Whiskered Bat 46
14 Whiskered/Brandt's Bat 22

This result shows a small, positive correlation with a high p value, so there's not much correlation as can be seen on the plot below. This is the expected result for the period and it strengthens the result for the correlation of the hibernation and roost counts for the selected year of 2010.

In [98]:
#Comparison of roost and hibernation counts 1999-2013
fig = plt.figure(figsize = (25,15))
ax = pivoted.plot.barh(color='blue',stacked='True', figsize = (12,8))
pivoted1.plot.barh(ax=ax,color='red')
ax.set_yticklabels(list(pivoted1['commonname']))
legendpatch_list = ([mpatches.Patch(color='blue', label='1999 - 2013 Roost Counts')]+
                    [mpatches.Patch(color='red', label='1999 - 2013 Hibernation Counts')])

plt.legend(handles=legendpatch_list, loc=(0.7,0.5), fontsize=12)
Out[98]:
<matplotlib.legend.Legend at 0xa8638a4c>
<matplotlib.figure.Figure at 0xa884ee6c>
In [99]:
#Total combined counts for 1999 - 2013
totals = Combined_df[['commonname','year']]
totals = Combined_df.pivot_table(index=['commonname'], aggfunc='count')
totals.reset_index(['year','commonname'],inplace=True)
totals = pivoted1.rename(columns = {'year' : 'Count'})
totals
Out[99]:
commonname Count
0 Bechstein's Bat 10
1 Brandt's Bat 26
2 Brown Long-eared Bat 1192
3 Common Pipistrelle 3653
4 Daubenton's Bat 99
5 Greater Horseshoe Bat 288
6 Grey Long-eared Bat 1
7 Lesser Horseshoe Bat 1972
8 Natterer's Bat 640
9 Noctule Bat 27
10 Pipistrelle 1589
11 Serotine 639
12 Soprano Pipistrelle 2540
13 Whiskered Bat 46
14 Whiskered/Brandt's Bat 22
In [100]:
#plotting 1999 - 2013
ax = totals.plot(kind='bar')
ax.set_xticklabels(list(totals.commonname))
plt.title('Total Counts 1999-2013', fontsize=20, color='Blue')
plt.xlabel('Bat Types (Common Name)')
plt.ylabel('Total Count')
Out[100]:
<matplotlib.text.Text at 0xa8605a0c>

Selecting one year of data to compare to one year of Hibernation data

Looking at the extra months data for 2010 to reveiew exta counts within unexpected months.

In [101]:
#selecting 2010
One_year_extra = Hibernation_data[Hibernation_data['year']==2010]
One_year_extra.head()
Out[101]:
observationid recordkey gridreference year month latitude longitude commonname
1728 463173795 1925 SS53 2010 1 51.11927932 -4.065889493 Lesser Horseshoe Bat
1729 463173796 1926 SS53 2010 1 51.11927932 -4.065889493 Greater Horseshoe Bat
1730 463173797 1927 SS53 2010 1 51.11927932 -4.065889493 Natterer's Bat
1731 463176374 1928 SS53 2010 1 51.11927932 -4.065889493 Lesser Horseshoe Bat
1732 463178159 1929 SS54 2010 1 51.20920073 -4.069910501 Greater Horseshoe Bat
In [102]:
#selecting 2010
Roost_year_extra = Roost_data[Roost_data['year']==2010]
Roost_year_extra.head()
Out[102]:
observationid recordkey gridreference year month latitude longitude commonname
0 463208902 2 TR12 2010 6 51.02961366 1.071575094 Serotine
1 463208903 3 TR12 2010 6 51.02961366 1.071575094 Serotine
2 463199389 4 TQ97 2010 10 51.478886 0.813687061 Serotine
3 463199390 5 TQ97 2010 6 51.478886 0.813687061 Serotine
4 463209418 6 TR12 2010 6 51.02961366 1.071575094 Brown Long-eared Bat
In [103]:
#Look at actual hibernation months - not study hibernation months
winter_months_df = One_year_extra[One_year_extra['month'].isin([1,2])]
winter_months_df.head()
Out[103]:
observationid recordkey gridreference year month latitude longitude commonname
1728 463173795 1925 SS53 2010 1 51.11927932 -4.065889493 Lesser Horseshoe Bat
1729 463173796 1926 SS53 2010 1 51.11927932 -4.065889493 Greater Horseshoe Bat
1730 463173797 1927 SS53 2010 1 51.11927932 -4.065889493 Natterer's Bat
1731 463176374 1928 SS53 2010 1 51.11927932 -4.065889493 Lesser Horseshoe Bat
1732 463178159 1929 SS54 2010 1 51.20920073 -4.069910501 Greater Horseshoe Bat
In [104]:
#hibernation value counts
winter_months_df.commonname.value_counts()
Out[104]:
Natterer's Bat            258
Lesser Horseshoe Bat      169
Daubenton's Bat           141
Brown Long-eared Bat      120
Greater Horseshoe Bat      74
Whiskered/Brandt's Bat     54
Pipistrelle                23
Common Pipistrelle         22
Whiskered Bat              14
Serotine                    5
Brandt's Bat                5
Soprano Pipistrelle         4
Bechstein's Bat             2
Name: commonname, dtype: int64
In [105]:
#looking at counts in the summer months in the hibernation data
summerhib_months_df = One_year_extra[One_year_extra['month'].isin([5,6,7])]
len(summerhib_months_df)
Out[105]:
133
In [106]:
summerhib_months_df.commonname.value_counts()
Out[106]:
Natterer's Bat            32
Lesser Horseshoe Bat      26
Greater Horseshoe Bat     25
Daubenton's Bat           17
Brown Long-eared Bat      14
Whiskered/Brandt's Bat     7
Pipistrelle                7
Brandt's Bat               2
Whiskered Bat              2
Soprano Pipistrelle        1
Name: commonname, dtype: int64
In [107]:
#These are months where there should be no hibernating bats
extrahib_months_df =One_year_extra[One_year_extra['month'].isin([3,4,8,9,10,11,12])]
len(extrahib_months_df)
Out[107]:
365
In [108]:
#these are the months where bats shouldn't hibernate
extrahib_months_df.commonname.value_counts()
Out[108]:
Natterer's Bat            90
Greater Horseshoe Bat     73
Lesser Horseshoe Bat      62
Brown Long-eared Bat      62
Daubenton's Bat           56
Pipistrelle                9
Soprano Pipistrelle        5
Whiskered/Brandt's Bat     5
Common Pipistrelle         2
Serotine                   1
Name: commonname, dtype: int64
In [109]:
#these are reported roosting months
summer_months_df = Roost_year_extra[Roost_year_extra['month'].isin([5,6,7])]
summer_months_df.head()
Out[109]:
observationid recordkey gridreference year month latitude longitude commonname
0 463208902 2 TR12 2010 6 51.02961366 1.071575094 Serotine
1 463208903 3 TR12 2010 6 51.02961366 1.071575094 Serotine
3 463199390 5 TQ97 2010 6 51.478886 0.813687061 Serotine
4 463209418 6 TR12 2010 6 51.02961366 1.071575094 Brown Long-eared Bat
5 463209419 7 TR12 2010 6 51.02961366 1.071575094 Brown Long-eared Bat
In [110]:
summer_months_df.commonname.value_counts()
Out[110]:
Common Pipistrelle        319
Soprano Pipistrelle       230
Lesser Horseshoe Bat      158
Brown Long-eared Bat      117
Pipistrelle                70
Natterer's Bat             52
Serotine                   43
Greater Horseshoe Bat      25
Daubenton's Bat             8
Whiskered Bat               5
Whiskered/Brandt's Bat      3
Brandt's Bat                3
Noctule Bat                 2
Name: commonname, dtype: int64
In [111]:
#these are the months for hibernation in the roost data
winterroost_months_df = Roost_year_extra[Roost_year_extra['month'].isin([1,2])]
len(winterroost_months_df)
Out[111]:
128
In [112]:
winterroost_months_df.commonname.value_counts()
Out[112]:
Common Pipistrelle       30
Pipistrelle              29
Soprano Pipistrelle      26
Lesser Horseshoe Bat     26
Brown Long-eared Bat      8
Natterer's Bat            5
Greater Horseshoe Bat     2
Serotine                  1
Brandt's Bat              1
Name: commonname, dtype: int64
In [113]:
#these are neither roost or hibernation months
extraroost_months_df = Roost_year_extra[Roost_year_extra['month'].isin([3,4,8,9,10,11,12])]
print(len(extraroost_months_df),len(extrahib_months_df))
#A similar result
342 365
In [114]:
groupeddata6 = extraroost_months_df.groupby(['commonname'])
grouped6 = groupeddata6['commonname'].aggregate('count')
groupeddata6.size()
#experimenting with grouping to try to get a sutable format extraroost_months_df.
Out[114]:
commonname
Brandt's Bat               1
Brown Long-eared Bat      34
Common Pipistrelle       111
Daubenton's Bat            1
Greater Horseshoe Bat     10
Lesser Horseshoe Bat      71
Natterer's Bat            20
Pipistrelle               18
Serotine                  11
Soprano Pipistrelle       64
Whiskered Bat              1
dtype: int64
In [115]:
#removing this as it only appears in the rrost counts in very small numbers
extraroost_months_df = extraroost_months_df.drop(extraroost_months_df[extraroost_months_df.commonname == "Whiskered Bat"].index)
In [116]:
extrar=extraroost_months_df.commonname.value_counts()
extrar
Out[116]:
Common Pipistrelle       111
Lesser Horseshoe Bat      71
Soprano Pipistrelle       64
Brown Long-eared Bat      34
Natterer's Bat            20
Pipistrelle               18
Serotine                  11
Greater Horseshoe Bat     10
Brandt's Bat               1
Daubenton's Bat            1
Name: commonname, dtype: int64
In [117]:
groupeddata7 = extrahib_months_df.groupby(['commonname'])
grouped7 = groupeddata7['commonname'].aggregate('count')
groupeddata7.size()
#experimenting with grouping to try to get a sutable format extrahib_months_df.
Out[117]:
commonname
Brown Long-eared Bat      62
Common Pipistrelle         2
Daubenton's Bat           56
Greater Horseshoe Bat     73
Lesser Horseshoe Bat      62
Natterer's Bat            90
Pipistrelle                9
Serotine                   1
Soprano Pipistrelle        5
Whiskered/Brandt's Bat     5
dtype: int64
In [118]:
#getting counts to compare
extrah=extrahib_months_df.commonname.value_counts()
extrah
Out[118]:
Natterer's Bat            90
Greater Horseshoe Bat     73
Lesser Horseshoe Bat      62
Brown Long-eared Bat      62
Daubenton's Bat           56
Pipistrelle                9
Soprano Pipistrelle        5
Whiskered/Brandt's Bat     5
Common Pipistrelle         2
Serotine                   1
Name: commonname, dtype: int64

Using correlation to see the relationship between the extra months

In [119]:
scipy.stats.pearsonr(extrar,extrah)
Out[119]:
(0.88695701996513654, 0.0006219971162897015)

The outliers in the plot below don't show statistical significance as can be shown by the value of results of the pearson coefficient. The plot has good correlation with a value of r near to 1 and a small p value.

In [120]:
fig = plt.figure(figsize = (12,10))

plt.scatter(extrar,extrah, color=['red','blue'], marker='h',s =50)

axes = plt.gca()
m, b = np.polyfit(extrar,extrah, 1)
plt.annotate('Outlier', xy=(20,58), xytext=(22,70), 
             arrowprops=dict(facecolor='black', shrink=0.05, width=8))
plt.annotate('Outlier', xy=(33,62), xytext=(37,74), 
             arrowprops=dict(facecolor='black', shrink=0.05, width=8))

X_plot = np.linspace(axes.get_xlim()[0],axes.get_xlim()[1],100)
plt.plot(X_plot, m*X_plot + b, '-', color='black')
plt.title('Correlation of Extra Roost and Hibernation Counts 2010', fontsize=20, color='Black')
legendpatch_list = ([mpatches.Patch(color='red', label='2010 Extra Roost Counts')]+
                    [mpatches.Patch(color='blue', label='2010 Extra Hibernation Counts')])

plt.legend(handles=legendpatch_list, loc=(0.6,0.2), fontsize=12)
plt.xlabel('Bat Counts (Hibernation)', fontsize=15)
plt.ylabel('Bat Counts (Roost)', fontsize=15)
#We can see a strong correlation between the roost and hibernation datasets however I will look at the outliers
#The outliers can be explained because the types of bats counted in the roost and hibernation datasets are different and the ones
#covering the extra months are likely to be more inconsistent than the official counts.
#Overall the pattern is good.
Out[120]:
<matplotlib.text.Text at 0xa7ad716c>
In [121]:
extra_hib=pd.DataFrame(extrah)
extra_hib.reset_index()
extra_hib.columns
extra_hib = extra_hib.rename(columns = {'commonname' : 'HibernationCount'})
extra_hib.index.names = ['commonname']
extra_hib.reset_index(level=0, inplace=True)
In [122]:
roost_e=pd.DataFrame(extrar)
roost_e.reset_index()
roost_e.columns
roost_e = roost_e.rename(columns = {'commonname' : 'RoostCount'})
roost_e.index.names = ['commonname']
roost_e.reset_index(level=0, inplace=True)
In [123]:
whole_extra = pd.merge(extra_hib, roost_e, on=['commonname'])
In [124]:
#Adding colours
from itertools import cycle, islice
e_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue', '#00E28E','#008DB8', '#00FF80',
                               '#00C69C','red', 'blue', 'green', 'cyan', 'yellow']),None, len(whole_extra)))

whole_extra.plot.bar(x='commonname',subplots=True,color=['blue','red'], figsize=(15,10), fontsize=12)
plt.xlabel('Common Name', fontsize=18)
plt.ylabel('Total Count',fontsize=18)
Out[124]:
<matplotlib.text.Text at 0xa77f030c>
In [125]:
diff1 = extrar- extrah

#Histogram of count differences which follow a normal distibution as expected
diff1.plot(kind='hist',bins=10,figsize=(10,8))
Out[125]:
<matplotlib.axes._subplots.AxesSubplot at 0xa7ad38cc>

looking at pipistrelle to try to find the missing ones, hibernation months can cover 11,12,1,2. Looking at 1999-2013

In [126]:
#further pipistrelle exloration, where do they go in winter?
pip_df = Roost_data[Roost_data['month'].isin([1,2,11,12])]
In [127]:
#They go into the roost dataset by mistake?
pip = pip_df[pip_df['commonname'] == 'Pipistrelle']
len(pip)
Out[127]:
734
In [128]:
#how many are here?
pip_df1 = Hibernation_data[Hibernation_data['month'].isin([1,2,11,12])]
In [129]:
#hmmm strange
pip1 = pip_df1[pip_df1['commonname'] == 'Pipistrelle']
len(pip1)
Out[129]:
390

There seem to be quite a lot of Pipistrelles in the Roost data, so will look at 2010

In [130]:
pip_df10 = RoostSelectYear_df[RoostSelectYear_df['month'].isin([1,2,11,12])]#Roost
pip10 = pip_df10[pip_df10['commonname'] == 'Pipistrelle']
len(pip10)
Out[130]:
0
In [131]:
piph_df10 = One_year[One_year['month'].isin([1,2,11,12])]#Hibernation
piph10 = piph_df10[piph_df10['commonname'] == 'Pipistrelle']
len(piph10)
Out[131]:
23

It would appear that over one year there is not too much of an issue but over the whole period there are 734 Pipistrelles in the Roost dataset which should be in the Hibernation one.

Looking at the overall counts for 1999 - 2013

In [132]:
#Hibernation data
groupeddata = Hibernation_data.groupby(['commonname'])
grouped = groupeddata['commonname'].aggregate('count')
groupeddata.size()
#experimenting with grouping to try to get a sutable format
Out[132]:
commonname
Bechstein's Bat             75
Brandt's Bat               135
Brown Long-eared Bat      2835
Common Pipistrelle         278
Daubenton's Bat           3950
Greater Horseshoe Bat     2965
Grey Long-eared Bat          1
Lesser Horseshoe Bat      4547
Natterer's Bat            5061
Noctule Bat                  8
Pipistrelle                568
Serotine                    45
Soprano Pipistrelle        137
Whiskered Bat              204
Whiskered/Brandt's Bat    1324
dtype: int64
In [133]:
#plotting all the hibernating bats
totals = groupeddata.sum()
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(new)))
grouped.plot(kind='bar', color=my_colors, figsize =(20,8), fontsize=12)
plt.title('Total Hibernating Bats 1999 - 2013', fontsize=20, color='Black')
plt.xlabel('Bat Name', fontsize=15)
plt.ylabel('Total Count',fontsize=15)
Out[133]:
<matplotlib.text.Text at 0xa831488c>
In [134]:
#Roost data grouping to plot
groupeddata1 = Roost_data.groupby(['commonname'])
grouped1 = groupeddata1['commonname'].aggregate('count')
groupeddata1.size()
Out[134]:
commonname
Bechstein's Bat             22
Brandt's Bat                60
Brown Long-eared Bat      1893
Common Pipistrelle        6408
Daubenton's Bat            160
Greater Horseshoe Bat      514
Grey Long-eared Bat          5
Lesser Horseshoe Bat      4198
Natterer's Bat            1042
Noctule Bat                 73
Pipistrelle               3437
Serotine                  1107
Soprano Pipistrelle       4294
Whiskered Bat               97
Whiskered/Brandt's Bat      33
dtype: int64
In [135]:
#plotting total roost bats overall
grouped1.plot(kind='bar',color=my_colors, figsize =(20,8), fontsize=12)
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(new)))
plt.title('Total Roost Bats 1999 -2013', fontsize=20, color='Black')
plt.xlabel('Bat Name', fontsize=15)
plt.ylabel('Total Count',fontsize=15)
Out[135]:
<matplotlib.text.Text at 0xa814c08c>
In [136]:
#counts to compare
roosts = Roost_data.commonname.value_counts()
roosts
Out[136]:
Common Pipistrelle        6408
Soprano Pipistrelle       4294
Lesser Horseshoe Bat      4198
Pipistrelle               3437
Brown Long-eared Bat      1893
Serotine                  1107
Natterer's Bat            1042
Greater Horseshoe Bat      514
Daubenton's Bat            160
Whiskered Bat               97
Noctule Bat                 73
Brandt's Bat                60
Whiskered/Brandt's Bat      33
Bechstein's Bat             22
Grey Long-eared Bat          5
Name: commonname, dtype: int64
In [137]:
#counts to compare
hibs = Hibernation_data.commonname.value_counts()
hibs
Out[137]:
Natterer's Bat            5061
Lesser Horseshoe Bat      4547
Daubenton's Bat           3950
Greater Horseshoe Bat     2965
Brown Long-eared Bat      2835
Whiskered/Brandt's Bat    1324
Pipistrelle                568
Common Pipistrelle         278
Whiskered Bat              204
Soprano Pipistrelle        137
Brandt's Bat               135
Bechstein's Bat             75
Serotine                    45
Noctule Bat                  8
Grey Long-eared Bat          1
Name: commonname, dtype: int64
In [138]:
#good correlation
scipy.stats.pearsonr(roosts,hibs)
Out[138]:
(0.97400877748374237, 9.1759412620459035e-10)
In [139]:
#Getting the hibernating months
winterhib_months_df = Hibernation_data[Hibernation_data['month'].isin([1,2])]
In [140]:
#Getting the hibernating months
winterroost_months_df = Roost_data[Roost_data['month'].isin([1,2])]
In [141]:
#Getting the hibernating sumer values
summerhib_months_df = Hibernation_data[Hibernation_data['month'].isin([5,6,7])]
In [142]:
#Getting the roost summer values
summerroost_months_df = Roost_data[Roost_data['month'].isin([5,6,7])]
In [143]:
#Combining the winter months in the hibernation dataset with the winter months in the roost dataset 
winter_months = [winterroost_months_df, winterhib_months_df]
winter_df = pd.concat(winter_months)
winter_df = winter_df[winter_months[0].columns]
len(winter_df)
Out[143]:
14418
In [144]:
#Total 'correct' winter months
groupeddata2 = winter_df.groupby(['commonname'])
grouped2 = groupeddata2['commonname'].aggregate('count')
groupeddata2.size()
Out[144]:
commonname
Bechstein's Bat             46
Brandt's Bat                86
Brown Long-eared Bat      1769
Common Pipistrelle         668
Daubenton's Bat           2325
Greater Horseshoe Bat     1224
Grey Long-eared Bat          1
Lesser Horseshoe Bat      2930
Natterer's Bat            3063
Noctule Bat                 14
Pipistrelle                856
Serotine                    77
Soprano Pipistrelle        357
Whiskered Bat              135
Whiskered/Brandt's Bat     867
dtype: int64
In [145]:
#plotting winter bats
grouped2.plot(kind='bar',color=my_colors, figsize =(20,8), fontsize=12)
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(new)))
plt.title('Total Winter Bats 1999 -2013', fontsize=20, color='Black')
plt.xlabel('Bat Name', fontsize=15)
plt.ylabel('Total Count',fontsize=15)
Out[145]:
<matplotlib.text.Text at 0xa7e3a12c>
In [146]:
#Combining the summer months in the roost dataset with the summer months in the hibernation dataset 
summer_months = [summerroost_months_df, summerhib_months_df]
summer_df = pd.concat(summer_months)
summer_df = summer_df[summer_months[0].columns]
len(summer_df)
Out[146]:
17900
In [147]:
#Total 'correct' summer months
groupeddata3 = summer_df.groupby(['commonname'])
grouped3 = groupeddata3['commonname'].aggregate('count')
groupeddata3.size()
Out[147]:
commonname
Bechstein's Bat             15
Brandt's Bat                61
Brown Long-eared Bat      1625
Common Pipistrelle        4454
Daubenton's Bat            461
Greater Horseshoe Bat      558
Grey Long-eared Bat          3
Lesser Horseshoe Bat      3080
Natterer's Bat            1150
Noctule Bat                 51
Pipistrelle               2406
Serotine                   820
Soprano Pipistrelle       3014
Whiskered Bat               81
Whiskered/Brandt's Bat     121
dtype: int64
In [148]:
#plotting total summer bats
grouped3.plot(kind='bar',color=my_colors, figsize =(20,8), fontsize=12)
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(new)))
plt.title('Total Summer Bats 1999 -2013', fontsize=20, color='Black')
plt.xlabel('Bat Name', fontsize=15)
plt.ylabel('Total Count',fontsize=15)
Out[148]:
<matplotlib.text.Text at 0xa7d8f58c>
In [149]:
#Roost counts

fig = plt.figure(figsize = (20,10))

# axes object for first subplot.
# 111 is a grid of 1 by 1 subplots
SummerAxes = fig.add_subplot(111)
grouped1.plot(kind='bar',ax=SummerAxes, ylim=[0,7000], color='red',fontsize=15)
SummerAxes.set_ylabel('Roost Counts',fontsize=20)
# second axes object representing the extra counts from the hibernation dataset, in the same place (twinned 
# with the SummerAxes) shared (twiny() would allow sharing the y-axes).
ExtrasAxes = SummerAxes.twinx()

# plot onto the ExtrasAxes:
grouped3.plot.line(ax=ExtrasAxes,ylim=[0,7000], color='blue',fontsize=15)
# set the Extras Axes y-axis label,
ExtrasAxes.set_ylabel('Extra summmer Counts',fontsize=20)

# The mpatches.Patch() from matplotlib creates an element that can 
# be used in the handles parameter of the plt.legend() function to create an entry in the legend.

import matplotlib.patches as mpatches
legendpatch_list = ([mpatches.Patch(color='red', label='Roost Counts')]+
                    [mpatches.Patch(color='blue', label='Extra summer Counts')])
                      
#the x,y values for the location are figure coordinates between 0 and 1.
#  If you exceed 1, the legend is drawn outside the figure as shown.
plt.legend(handles=legendpatch_list, loc=(1.1,0.7), fontsize=20)


# and a title
plt.title('Roost and Roost with extra summer counts compared', 
          fontsize=20)
#It would seem that there is a very similar pattern in the Roost data for the original data and the data selected for the correct 
#months combined with the hibernation summer months data
Out[149]:
<matplotlib.text.Text at 0xa7d4050c>
In [150]:
#Hibernation Counts for the winter months
fig = plt.figure(figsize = (20,10))

# axes object for first subplot.
# 111 is a grid of 1 by 1 subplots
WinterAxes = fig.add_subplot(111)
grouped.plot(kind='bar',ax=WinterAxes, ylim=[0,7000], color='yellow',fontsize=15)
WinterAxes.set_ylabel('Total Hibernation Counts',fontsize=20)
# second axes object representing the extra counts from the roost dataset, in the same place (twinned 
# with the SummerAxes) shared (twiny() would allow sharing the y-axes).
ExtraAxes = WinterAxes.twinx()


# plot onto the ExtrasAxes:
grouped2.plot.line(ax=ExtraAxes,ylim=[0,7000], color='green',fontsize=15 )
# set the Extras Axes y-axis label,
ExtraAxes.set_ylabel('Total winter Counts',fontsize=20)

# The mpatches.Patch() from matplotlib creates an element that can 
# be used in the handles parameter of the plt.legend() function to create an entry in the legend.
import matplotlib.patches as mpatches
legendpatch_list = ([mpatches.Patch(color='yellow', label='Total Hibernation Counts')]+
                    [mpatches.Patch(color='green', label='Total winter Counts')])
#the x,y values for the location are figure coordinates between 0 and 1.
#  If you exceed 1, the legend is drawn outside the figure as shown.
plt.legend(handles=legendpatch_list, loc=(1.1,0.7), fontsize=20)

# and a title
plt.title('Hibernation and extra winter counts compared', 
          fontsize=20)
#It would seem that there is a similar pattern in the Hibernation data for the original data and the data selected for the correct 
#months combined with the roost winter months data
Out[150]:
<matplotlib.text.Text at 0xa6a4996c>
In [151]:
#getting the summer value counts
summer = summer_df.commonname.value_counts()
summer
Out[151]:
Common Pipistrelle        4454
Lesser Horseshoe Bat      3080
Soprano Pipistrelle       3014
Pipistrelle               2406
Brown Long-eared Bat      1625
Natterer's Bat            1150
Serotine                   820
Greater Horseshoe Bat      558
Daubenton's Bat            461
Whiskered/Brandt's Bat     121
Whiskered Bat               81
Brandt's Bat                61
Noctule Bat                 51
Bechstein's Bat             15
Grey Long-eared Bat          3
Name: commonname, dtype: int64
In [152]:
#getting the winter value counts
winter = winter_df.commonname.value_counts()
winter
Out[152]:
Natterer's Bat            3063
Lesser Horseshoe Bat      2930
Daubenton's Bat           2325
Brown Long-eared Bat      1769
Greater Horseshoe Bat     1224
Whiskered/Brandt's Bat     867
Pipistrelle                856
Common Pipistrelle         668
Soprano Pipistrelle        357
Whiskered Bat              135
Brandt's Bat                86
Serotine                    77
Bechstein's Bat             46
Noctule Bat                 14
Grey Long-eared Bat          1
Name: commonname, dtype: int64

Using merge to have both on the same plot for the sorted hib and roost months and will rename the count columns

In [153]:
whole_hib=pd.DataFrame(winter)
whole_hib.reset_index()
whole_hib.columns
whole_hib = whole_hib.rename(columns = {'commonname' : 'HibernationCount'})
whole_hib.index.names = ['commonname']
whole_hib.reset_index(level=0, inplace=True)
In [154]:
whole_roost=pd.DataFrame(summer)
whole_roost.reset_index()
whole_roost.columns
whole_roost = whole_roost.rename(columns = {'commonname' : 'RoostCount'})
whole_roost.index.names = ['commonname']
whole_roost.reset_index(level=0, inplace=True)
In [155]:
whole_merged = pd.merge(whole_hib, whole_roost, on=['commonname'])
In [156]:
whole_merged.head()
Out[156]:
commonname HibernationCount RoostCount
0 Natterer's Bat 3063 1150
1 Lesser Horseshoe Bat 2930 3080
2 Daubenton's Bat 2325 461
3 Brown Long-eared Bat 1769 1625
4 Greater Horseshoe Bat 1224 558
In [157]:
#Adding colours
from itertools import cycle, islice
merge_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(whole_merged)))
In [158]:
whole_merged.plot.bar(x='commonname',subplots=True,color=['blue','red'], figsize=(16,10), fontsize=12)
plt.xlabel('Common Name', fontsize=18)
plt.ylabel('Total Count',fontsize=18)
Out[158]:
<matplotlib.text.Text at 0xa60db5cc>
In [159]:
#Winter and summer counts compared
fig = plt.figure(figsize = (20,10))

# axes object for first subplot.
# 111 is a grid of 1 by 1 subplots
TheAxes = fig.add_subplot(111)
grouped2.plot(kind='bar',ax=TheAxes, ylim=[0,5000], color='blue',fontsize=15)
TheAxes.set_ylabel('Total Winter Counts',fontsize=20)
TheAxes.set_xlabel('Common Name',fontsize=20)
# second axes object representing the extra counts from the roost dataset, in the same place (twinned 
# with the SummerAxes) shared (twiny() would allow sharing the y-axes).
OtherAxes = TheAxes.twinx()


# plot onto the ExtrasAxes:
grouped3.plot.line(ax=OtherAxes,ylim=[0,5000], color='red',fontsize=15)
# set the Extras Axes y-axis label,
OtherAxes.set_ylabel('Total Summer Counts',fontsize=20)

# The mpatches.Patch() from matplotlib creates an element that can 
# be used in the handles parameter of the plt.legend() function to create an entry in the legend.
import matplotlib.patches as mpatches
legendpatch_list = ([mpatches.Patch(color='blue', label='Total Winter Counts')]+
                    [mpatches.Patch(color='red', label='Total Summer Counts')])
#the x,y values for the location are figure coordinates between 0 and 1.
#  If you exceed 1, the legend is drawn outside the figure as shown.
plt.legend(handles=legendpatch_list, loc=(1.1,0.7), fontsize=20)

# and a title
plt.title('Winter and Summer counts compared 1999-2013', fontsize=20)
#Here we start to see a real difference in counts which takes me to my Question 2 which is comparing the numbers of hibernating
#bats with the numbers of roosting bats. 
Out[159]:
<matplotlib.text.Text at 0xa661d3ac>
In [160]:
scipy.stats.pearsonr(summer,winter)
Out[160]:
(0.98369084364741599, 4.5413853151850362e-11)
In [161]:
#plotting correlation between summer and winter counts
fig = plt.figure(figsize = (12,10))

plt.scatter(winter,summer, color=['red','blue'], marker='h',s =50)

axes = plt.gca()
m, b = np.polyfit(winter,summer, 1)
plt.annotate('Outlier', xy=(2950,3000), xytext=(3100,3200), 
             arrowprops=dict(facecolor='black', shrink=0.05, width=8))
plt.annotate('Outlier', xy=(3100,4500), xytext=(3200,4800), 
             arrowprops=dict(facecolor='black', shrink=0.05, width=8))

X_plot = np.linspace(axes.get_xlim()[0],axes.get_xlim()[1],100)
plt.plot(X_plot, m*X_plot + b, '-', color='black')
plt.title('Correlation of Roost and Hibernation Counts 1999-2013', fontsize=20, color='Black')
legendpatch_list = ([mpatches.Patch(color='red', label='Roost Counts')]+
                    [mpatches.Patch(color='blue', label='Hibernation Counts')])

plt.legend(handles=legendpatch_list, loc=(0.6,0.2), fontsize=15)
plt.xlabel('Bat Counts (Hibernation)', fontsize=15)
plt.ylabel('Bat Counts (Roost)', fontsize=15)
#We can see a strong correlation between the roost and hibernation datasets
Out[161]:
<matplotlib.text.Text at 0xa694092c>

So how many bats make it out of hibernation? I would have to take individual years of data, so if I look at winter 2009 - 2010 then I would compare that to summer 2010 I could do this for a few examples and then could plot them together.

In [162]:
#Having a look at the pattern of the cross-over data, where the bats are coming out of hibernation
cross_months_df = Roost_data[Roost_data['month'].isin([3,4])]
over_months_df = Hibernation_data[Hibernation_data['month'].isin([3,4])]
In [163]:
#Combining the extra months in the roost dataset with the extra months in the hibernation dataset 
crossover_months = [cross_months_df, over_months_df]
crossover_df = pd.concat(crossover_months)
crossover_df = crossover_df[crossover_months[0].columns]
crossover_df.head()
Out[163]:
observationid recordkey gridreference year month latitude longitude commonname
11 463191184 15 SO98 2010 4 52.46741971 -2.067667717 Lesser Horseshoe Bat
69 463186258 106 SD46 2010 3 54.08668815 -2.83460701 Common Pipistrelle
88 463189133 125 ST70 2010 3 50.84849565 -2.349384444 Lesser Horseshoe Bat
93 463199177 138 SX71 2010 4 50.04102895 -3.739919406 Lesser Horseshoe Bat
95 463204857 140 SX70 2010 4 49.95108655 -3.736676733 Lesser Horseshoe Bat
In [164]:
crossover_df['commonname'].value_counts()
Out[164]:
Lesser Horseshoe Bat      896
Greater Horseshoe Bat     344
Natterer's Bat            342
Daubenton's Bat           236
Common Pipistrelle        221
Brown Long-eared Bat      212
Soprano Pipistrelle       119
Pipistrelle               116
Whiskered/Brandt's Bat     89
Whiskered Bat              25
Brandt's Bat               14
Serotine                   12
Bechstein's Bat             8
Noctule Bat                 3
Name: commonname, dtype: int64
In [165]:
#grouping to plot
groupeddata4 = crossover_df.groupby(['commonname'])
grouped4 = groupeddata4['commonname'].aggregate('count')
groupeddata4.size()
Out[165]:
commonname
Bechstein's Bat             8
Brandt's Bat               14
Brown Long-eared Bat      212
Common Pipistrelle        221
Daubenton's Bat           236
Greater Horseshoe Bat     344
Lesser Horseshoe Bat      896
Natterer's Bat            342
Noctule Bat                 3
Pipistrelle               116
Serotine                   12
Soprano Pipistrelle       119
Whiskered Bat              25
Whiskered/Brandt's Bat     89
dtype: int64
In [166]:
#plotting the crossover values
grouped4.plot(kind='bar',figsize = (20,10))
plt.title('Total crossover Bats Spring', fontsize=20, color='Blue')
plt.xlabel('Bat Name')
plt.ylabel('Total Count')
Out[166]:
<matplotlib.text.Text at 0xa69838ac>
In [167]:
#Having a look at the pattern of the cross-over data, where the bats are coming out of hibernation
cross_monthsaut_df = Roost_data[Roost_data['month'].isin([8,9,10,11,12])]
over_monthsaut_df = Hibernation_data[Hibernation_data['month'].isin([8,9,10,11,12])]
In [168]:
#Combining the summer months in the roost dataset with the summer months in the hibernation dataset 
crossover_months_autumn = [cross_monthsaut_df, over_monthsaut_df]
crossover_aut_df = pd.concat(crossover_months_autumn)
crossover_aut_df = crossover_aut_df[crossover_months_autumn[0].columns]
crossover_aut_df.head()
Out[168]:
observationid recordkey gridreference year month latitude longitude commonname
2 463199389 4 TQ97 2010 10 51.478886 0.813687061 Serotine
6 463209420 8 TR12 2010 9 51.02961366 1.071575094 Brown Long-eared Bat
8 463195965 11 TR16 2010 8 51.38929986 1.095592557 Brown Long-eared Bat
12 463203521 16 SJ78 2010 10 53.36579521 -2.369647145 Common Pipistrelle
14 463200747 18 TR16 2010 8 51.38929986 1.095592557 Common Pipistrelle
In [169]:
#grouping the autumn months to plot
groupeddata10 = crossover_aut_df.groupby(['commonname'])
grouped10 = groupeddata10['commonname'].aggregate('count')
groupeddata10.size()
Out[169]:
commonname
Bechstein's Bat             28
Brandt's Bat                34
Brown Long-eared Bat      1122
Common Pipistrelle        1343
Daubenton's Bat           1088
Greater Horseshoe Bat     1353
Grey Long-eared Bat          2
Lesser Horseshoe Bat      1839
Natterer's Bat            1548
Noctule Bat                 13
Pipistrelle                627
Serotine                   243
Soprano Pipistrelle        941
Whiskered Bat               60
Whiskered/Brandt's Bat     280
dtype: int64
In [170]:
#spring months
grouped10.plot(kind='bar',figsize = (20,10))
plt.title('Total crossover Bats Spring', fontsize=20, color='Blue')
plt.xlabel('Bat Name')
plt.ylabel('Total Count')
Out[170]:
<matplotlib.text.Text at 0xa63a2dec>
In [171]:
#comparison of spring and autumn counts
fig = plt.figure(figsize = (20,10))

# axes object for first subplot.
# 111 is a grid of 1 by 1 subplots
Axes1 = fig.add_subplot(111)
grouped4.plot(kind='bar',ax=Axes1, ylim=[0,3000], color='blue',fontsize=15)
Axes1.set_ylabel('Extra Spring Counts',fontsize=20)
# second axes object representing the extra counts from the roost dataset, in the same place (twinned 
# with the SummerAxes) shared (twiny() would allow sharing the y-axes).
OAxes1 = Axes1.twinx()


# plot onto the ExtrasAxes:
grouped10.plot.line(ax=OAxes1,ylim=[0,3000], color='red',fontsize=15 )
# set the Extras Axes y-axis label,
OAxes1.set_ylabel('Extra Autumn Counts',fontsize=20)

# The mpatches.Patch() from matplotlib creates an element that can 
# be used in the handles parameter of the plt.legend() function to create an entry in the legend.
import matplotlib.patches as mpatches
legendpatch_list = ([mpatches.Patch(color='blue', label='Extra Spring Counts')]+
                    [mpatches.Patch(color='red', label='Extra Autumn Counts')])
#the x,y values for the location are figure coordinates between 0 and 1.
#  If you exceed 1, the legend is drawn outside the figure as shown.
plt.legend(handles=legendpatch_list, loc=(1.1,0.7), fontsize=20)

# and a title
plt.title('Extra Spring and Autumn Counts Compared', 
          fontsize=20)
#Here we start to see a real difference in counts which takes me to my Question 2 which is comparing the numbers of hibernating
#bats with the numbers of roosting bats. 
Out[171]:
<matplotlib.text.Text at 0xa62ca78c>
In [172]:
#crossover autumn months
grouped10.plot(figsize = (20,10))
plt.title('Total crossover Bats Autumn', fontsize=20, color='Blue')
plt.xlabel('Bat Name')
plt.ylabel('Total Count')
Out[172]:
<matplotlib.text.Text at 0xa62e16cc>
In [173]:
#How does this compare with the months of Jan and Feb when according to the Bat Conservation Trust most of the 
#counting takes place?
hib_months_df = Hibernation_data[Hibernation_data['month'].isin([1,2])]
ros_months_df = Roost_data[Roost_data['month'].isin([1,2])]
In [174]:
#Combining the study months in the roost dataset with the study months in the hibernation dataset 
test_months = [hib_months_df, ros_months_df]
study_df = pd.concat(test_months)
study_df = study_df[test_months[0].columns]
study_df.head()
Out[174]:
observationid recordkey gridreference year month latitude longitude commonname
0 463169251 1 TQ95 2008 1 51.29905055 0.802680508 Bechstein's Bat
3 463167926 5 TQ95 2008 1 51.29905055 0.802680508 Daubenton's Bat
5 463172436 8 TR15 2008 1 51.29938038 1.089541645 Natterer's Bat
6 463172438 10 TR15 2008 2 51.29938038 1.089541645 Natterer's Bat
7 463172439 11 TR15 2008 2 51.29938038 1.089541645 Brown Long-eared Bat
In [175]:
#grouping the study months
groupeddata5 = study_df.groupby(['commonname'])
grouped5 = groupeddata5['commonname'].aggregate('count')
groupeddata5.size()
Out[175]:
commonname
Bechstein's Bat             46
Brandt's Bat                86
Brown Long-eared Bat      1769
Common Pipistrelle         668
Daubenton's Bat           2325
Greater Horseshoe Bat     1224
Grey Long-eared Bat          1
Lesser Horseshoe Bat      2930
Natterer's Bat            3063
Noctule Bat                 14
Pipistrelle                856
Serotine                    77
Soprano Pipistrelle        357
Whiskered Bat              135
Whiskered/Brandt's Bat     867
dtype: int64
In [176]:
names = []
for i in study_df.commonname:
    if i not in names:
        names.append(i)
names
Out[176]:
["Bechstein's Bat",
 "Daubenton's Bat",
 "Natterer's Bat",
 'Brown Long-eared Bat',
 "Whiskered/Brandt's Bat",
 'Greater Horseshoe Bat',
 'Lesser Horseshoe Bat',
 'Common Pipistrelle',
 'Soprano Pipistrelle',
 'Pipistrelle',
 'Whiskered Bat',
 "Brandt's Bat",
 'Serotine',
 'Noctule Bat',
 'Grey Long-eared Bat']
In [177]:
#Adding colours
from itertools import cycle, islice
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue']), None, len(grouped5)))
#plotting the total hibernation
grouped5.plot(kind='bar',figsize = (18,12),color=my_colors)
plt.title('Total hibernation study Bats', fontsize=25, color='darkblue')
plt.xlabel('Bat Name', fontsize=15)
plt.ylabel('Total Count',fontsize=15)
Out[177]:
<matplotlib.text.Text at 0xa62bfaac>

It would appear that the cross-over months have small values in comparison to the hibernation and roost and this would be explained study periods. This would suggest that most of the counts are taking place in the correct months. The erroneous values from each dataset have been consolidated into the months as described in the study details on the Bat Conservation Trust website and these combined and cleaned datasets will be used to attempt to answer the question of how the bats are surviving the hibernation process. If I took this study further it would be to carry out analysis on how the wether patters affect this process and whether the numbers or types of bats (or both) were affected by particular weather conditions. It could well be similar to birds where they may start producing babies too early if late winter is warm and then bad weather hits again and the babaies don't survive. If this was to be studied I would have to look at the numbers counted coming out of hibernation the next year, to see if the babies had survived the previous year and were in the counts for this year.

2010

For this analysis: I will select individual years of data, so if I look at winter 2010 then I would compare that to summer 2010 I could do this for a few examples and then could plot them together. This should provide an indication of the patterns of bat numbers of the hibernating counts and the roost counts.

In [178]:
#Combining the roost dataset with the hibernation dataset 2010
totals2010 = [One_year, RoostSelectYear_df]
totals2010_df = pd.concat(totals2010)
totals2010_df= totals2010_df[totals2010[0].columns]
totals2010_df.head()
Out[178]:
observationid recordkey gridreference year month latitude longitude commonname
1728 463173795 1925 SS53 2010 1 51.11927932 -4.065889493 Lesser Horseshoe Bat
1729 463173796 1926 SS53 2010 1 51.11927932 -4.065889493 Greater Horseshoe Bat
1730 463173797 1927 SS53 2010 1 51.11927932 -4.065889493 Natterer's Bat
1731 463176374 1928 SS53 2010 1 51.11927932 -4.065889493 Lesser Horseshoe Bat
1732 463178159 1929 SS54 2010 1 51.20920073 -4.069910501 Greater Horseshoe Bat
In [179]:
#grouping to plot
totals2010_df = totals2010_df[['commonname','year']]
Combined10 = totals2010_df.pivot_table(index=['commonname'], aggfunc='count')
Combined10 = Combined10.rename(columns = {'year' : 'counts'})
Combined10.columns
Out[179]:
Index(['counts'], dtype='object')
In [180]:
#Total Bats 2010
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(grouped5)))
Combined10.plot.pie(y='counts',autopct='%.2f',labels=['','','','','','','','','','','','','',''],
                    figsize=(14, 18),colors=my_colors,fontsize=16)
#Combined10.plot(y='counts',kind='pie', fontsize=15, colors=my_colors,figsize =(20,20))
plt.axis('equal')
plt.ylabel('')
plt.legend(labels=Combined10.index,loc=(0.8,0.82), fontsize=14)
plt.title('Total Bats 2010', fontsize=20, color='black')
plt.show()

One_year, RoostSelectYear_df gives one year (2010) of data to compare

In [181]:
#grouping to compare counts
hibyear = One_year.groupby(['commonname'])
hibyeargroup = hibyear['commonname'].aggregate('count')
hibyear.size()
Out[181]:
commonname
Bechstein's Bat             2
Brandt's Bat                5
Brown Long-eared Bat      120
Common Pipistrelle         22
Daubenton's Bat           141
Greater Horseshoe Bat      74
Lesser Horseshoe Bat      169
Natterer's Bat            258
Pipistrelle                23
Serotine                    5
Soprano Pipistrelle         4
Whiskered Bat              14
Whiskered/Brandt's Bat     54
dtype: int64
In [182]:
stats = hibyeargroup.describe()
stats
Out[182]:
count     13.000000
mean      68.538462
std       80.627348
min        2.000000
25%        5.000000
50%       23.000000
75%      120.000000
max      258.000000
Name: commonname, dtype: float64
In [183]:
#grouping to compare counts
roostyear = RoostSelectYear_df.groupby(['commonname'])
roostyeargroup = roostyear['commonname'].aggregate('count')
roostyear.size()
Out[183]:
commonname
Brandt's Bat                3
Brown Long-eared Bat      117
Common Pipistrelle        319
Daubenton's Bat             8
Greater Horseshoe Bat      25
Lesser Horseshoe Bat      158
Natterer's Bat             52
Noctule Bat                 2
Pipistrelle                70
Serotine                   43
Soprano Pipistrelle       230
Whiskered Bat               5
Whiskered/Brandt's Bat      3
dtype: int64
In [184]:
stats1 = roostyeargroup.describe()
stats1
Out[184]:
count     13.000000
mean      79.615385
std      100.457734
min        2.000000
25%        5.000000
50%       43.000000
75%      117.000000
max      319.000000
Name: commonname, dtype: float64

Correlate

In [185]:
#Hibernation and Roost counts compared 2010
fig = plt.figure(figsize = (20,10))
# axes object for first subplot.
# 111 is a grid of 1 by 1 subplots
Axes = fig.add_subplot(111)
hibyeargroup.plot(kind='bar',ax=Axes, ylim=[0,500], color='blue',fontsize=15)
Axes.set_ylabel('2010 Hibernation Counts',fontsize=20)
# second axes object representing the extra counts from the roost dataset, in the same place (twinned 
# with the SummerAxes) shared (twiny() would allow sharing the y-axes).
OAxes = Axes.twinx()


# plot onto the ExtrasAxes:
roostyeargroup.plot.line(ax=OAxes,ylim=[0,500], color='red',fontsize=15 )
# set the Extras Axes y-axis label,
OAxes.set_ylabel('2010 Roost Counts',fontsize=20)

# The mpatches.Patch() from matplotlib creates an element that can 
# be used in the handles parameter of the plt.legend() function to create an entry in the legend.
import matplotlib.patches as mpatches
legendpatch_list = ([mpatches.Patch(color='blue', label='2010 Hibernation Counts')]+
                    [mpatches.Patch(color='red', label='2010 Roost Counts')])
#the x,y values for the location are figure coordinates between 0 and 1.
#  If you exceed 1, the legend is drawn outside the figure as shown.
plt.legend(handles=legendpatch_list, loc=(1.1,0.7), fontsize=20)

# and a title
plt.title('2010 Hibernation and Roost counts compared', 
          fontsize=20)
#Here we start to see a real difference in counts which takes me to my Question 2 which is comparing the numbers of hibernating
#bats with the numbers of roosting bats. 
Out[185]:
<matplotlib.text.Text at 0xa615ab0c>

It would seem that because there are different bats studied that this question would have to be answered with a specific study of particular bats. I can correlate the numbers but would have to take into account that these are different species of bat that I am comparing. However it may be that it is difficult to tell anyway and that the classifications made in the data collections have errors. Therefore taking the total count and comparing it may produce a guide of sorts.

In [186]:
pd.options.mode.chained_assignment = None  # default='warn'
In [187]:
#counts hibernation
counting=One_year.commonname.value_counts()
len(counting)
Out[187]:
13
In [188]:
#counts roost
counting1 = RoostSelectYear_df.commonname.value_counts()
len(counting1)
Out[188]:
13
In [189]:
diff =  counting1 - counting

Percentages rise in bat counts 1999-2013 and difference between roost and hibernation. Look at the bats of concern and try a histogram. Remember the map to illustrate. Find how to label points on a scatter plot. I chose the roost counts because I felt they would be most closely matched to the hibernation survey "As a general rule, trends calculated from the Field and Waterway Survey are considered to be most robust,followed by the Hibernation Survey and then the Roost Counts." The grey long-eared bat is one of the rarest bats in the UK, with a population estimated at 1,000 individuals and a distribution that is restricted mainly to the southern coast of England and Wales. Identify and monitor maternity roosts and hibernation sites is one of the ways of monitoring this species.

In [190]:
#removing to compare counts
One_year = One_year.drop(One_year[One_year.commonname == 'Noctule Bat'].index)
pd.unique(One_year.commonname)
Out[190]:
array(['Lesser Horseshoe Bat', 'Greater Horseshoe Bat', "Natterer's Bat",
       'Brown Long-eared Bat', "Daubenton's Bat", 'Whiskered Bat',
       "Brandt's Bat", 'Pipistrelle', 'Common Pipistrelle',
       "Whiskered/Brandt's Bat", "Bechstein's Bat", 'Serotine',
       'Soprano Pipistrelle'], dtype=object)
In [191]:
new_hib= pd.DataFrame(counting)
In [192]:
#correalting the roost and hibernation counts
fig = plt.figure(figsize = (12,10))
plt.scatter(counting, counting1,color=['red','blue'],s=60)
axes = plt.gca()
m, b = np.polyfit(counting, counting1, 1)
X_plot = np.linspace(axes.get_xlim()[0],axes.get_xlim()[1],100)
plt.plot(X_plot, m*X_plot + b, '-', color='black')

plt.title('Correlation of Roost and Hibernation Counts 2010', fontsize=15, color='Black')
legendpatch_list = ([mpatches.Patch(color='red', label='Roost Counts')]+
                    [mpatches.Patch(color='blue', label='Hibernation Counts')])

plt.legend(handles=legendpatch_list, loc=(0.6,0.2), fontsize=12)
plt.xlabel('Bat Counts (Hibernation)', fontsize=15)
plt.ylabel('Bat Counts (Roost)', fontsize=15)
#We can see a strong correlation between the roost and hibernation datasets
Out[192]:
<matplotlib.text.Text at 0xa60fe04c>
In [193]:
scipy.stats.pearsonr(counting,counting1)
Out[193]:
(0.99024652048030692, 9.10646391995923e-11)

This shows good correlation which would suggest that overall the bats survived the winter well, however this also takes into account new babies born, but it shows overall the survival rate was good. This one looks better than the one earlier in investigations.

In [194]:
#getting names
names = pd.unique(Combined_df.commonname)
names
Out[194]:
array(['Serotine', 'Brown Long-eared Bat', 'Common Pipistrelle',
       'Soprano Pipistrelle', 'Lesser Horseshoe Bat', 'Pipistrelle',
       "Natterer's Bat", "Daubenton's Bat", 'Greater Horseshoe Bat',
       'Noctule Bat', 'Whiskered Bat', "Brandt's Bat",
       "Whiskered/Brandt's Bat", 'Grey Long-eared Bat', "Bechstein's Bat"], dtype=object)
In [195]:
StudyBats= SelectYears_df.copy()
In [196]:
StudyBats = StudyBats.drop(StudyBats[StudyBats.commonname  == 'Greater Horseshoe Bat'].index)
StudyBats = StudyBats.drop(StudyBats[StudyBats.commonname  == 'Whiskered Bat'].index)
StudyBats = StudyBats.drop(StudyBats[StudyBats.commonname  == "Brandt's Bat"].index)
StudyBats = StudyBats.drop(StudyBats[StudyBats.commonname  == "Whiskered/Brandt's Bat"].index) 
StudyBats = StudyBats.drop(StudyBats[StudyBats.commonname  == 'Grey Long-eared Bat'].index)
StudyBats = StudyBats.drop(StudyBats[StudyBats.commonname  == "Bechstein's Bat"].index)
pd.unique(StudyBats.commonname)
#removing bats not in the main studies by JNCC
Out[196]:
array(["Daubenton's Bat", "Natterer's Bat", 'Brown Long-eared Bat',
       'Lesser Horseshoe Bat', 'Common Pipistrelle', 'Soprano Pipistrelle',
       'Pipistrelle', 'Serotine', 'Noctule Bat'], dtype=object)
In [197]:
#Getting the counts per year to calculate the percentage change

StudyBats = StudyBats[['commonname','year']]
pivot = StudyBats.pivot_table(index=['year'], aggfunc='count')
pivot.reset_index(['year','commonname'],inplace=True)
pivot.columns
Out[197]:
Index(['year', 'commonname'], dtype='object')
In [198]:
#plotting the counts to compare with the study
pivot.plot(kind = 'bar',x='year', y='commonname', figsize = (14,10),color='lightblue',ylim=[0,1000])
plt.title('Number of bats from the C8 Mammals Study Groups', fontsize=20, color='black')
plt.show()
#http://jncc.defra.gov.uk/page-4271

The results look to be around a 50% increase in numbers from 1999 to 2013 but the percentage calculations will show whether this is correct.

Not a good visulaisation

In [199]:
#Getting the counts per year to calculate the percentage change
SelectYears_df = SelectYears_df[['commonname','year']]
pivoted = SelectYears_df.pivot_table(index=['year'], aggfunc='count')
pivoted.reset_index(['year','commonname'],inplace=True)
pivoted = pivoted.rename(columns = {'commonname' : 'counts'})
pivoted.columns
Out[199]:
Index(['year', 'counts'], dtype='object')
In [200]:
whole_merged['Colour'] = 'white'
whole_merged.loc[(whole_merged.commonname == 'Soprano Pipistrelle'),['Colour']] = 'dodgerblue'  
whole_merged.loc[(whole_merged.commonname == 'Pipistrelle'),['Colour']] = '#191970' 
whole_merged.loc[(whole_merged.commonname == 'Common Pipistrelle'),['Colour']] = 'slateblue' 
whole_merged.loc[(whole_merged.commonname == 'Brown Long-eared Bat'),['Colour']] = '#00FF80' 
whole_merged.loc[(whole_merged.commonname == "Brandt's Bat"),['Colour']] = 'red' 
whole_merged.loc[(whole_merged.commonname == 'Noctule Bat'),['Colour']] = 'blue' 
whole_merged.loc[(whole_merged.commonname == "Bechstein's Bat"),['Colour']] = 'cyan' 
whole_merged.loc[(whole_merged.commonname == 'Greater Horseshoe Bat'),['Colour']] = '#00C69C' 
whole_merged.loc[(whole_merged.commonname == 'Serotine'),['Colour']] = '#00AAAA' 
whole_merged.loc[(whole_merged.commonname == "Daubenton's Bat"),['Colour']] = '#008DB8'
whole_merged.loc[(whole_merged.commonname == "Whiskered/Brandt's Bat"),['Colour']] = 'yellow' 
whole_merged.loc[(whole_merged.commonname == 'Grey Long-eared Bat'),['Colour']] = 'grey' 
whole_merged.loc[(whole_merged.commonname == 'Lesser Horseshoe Bat'),['Colour']] = '#00E28E' 
whole_merged.loc[(whole_merged.commonname == 'Noctule Bat'),['Colour']] = 'green' 
whole_merged.loc[(whole_merged.commonname == "Natterer's Bat"),['Colour']] = 'black' 
whole_merged.loc[(whole_merged.commonname == 'Whiskered Bat'),['Colour']] = '#0038E2'
whole_merged.head()
Out[200]:
commonname HibernationCount RoostCount Colour
0 Natterer's Bat 3063 1150 black
1 Lesser Horseshoe Bat 2930 3080 #00E28E
2 Daubenton's Bat 2325 461 #008DB8
3 Brown Long-eared Bat 1769 1625 #00FF80
4 Greater Horseshoe Bat 1224 558 #00C69C
In [201]:
pivoted.plot.scatter(x='year', y='counts', figsize = (14,10), s=pivoted['counts'],color=(0, 0, 1),ylim=[0,1000])
plt.ticklabel_format(useOffset=False, style='plain')
plt.xticks(pivoted['year'])
plt.title('Number of bats each year', fontsize=20, color='black')
Out[201]:
<matplotlib.text.Text at 0xa3a0c68c>

This represents the overall bat counts for the period 1999 - 2013. I calculated the percentage change each year and the overall percentage change below.

In [202]:
#grouping combined data to plot
Combined = Combined_df[['commonname','year']]
dpivoted = Combined.pivot_table(index=['commonname'], aggfunc='count')
dpivoted = dpivoted.rename(columns = {'year' : 'counts'})
dpivoted.columns
Out[202]:
Index(['counts'], dtype='object')
In [203]:
dpivoted
Out[203]:
counts
commonname
Bechstein's Bat 42
Brandt's Bat 102
Brown Long-eared Bat 2468
Common Pipistrelle 3796
Daubenton's Bat 1870
Greater Horseshoe Bat 1086
Grey Long-eared Bat 2
Lesser Horseshoe Bat 3693
Natterer's Bat 2942
Noctule Bat 31
Pipistrelle 1827
Serotine 664
Soprano Pipistrelle 2590
Whiskered Bat 141
Whiskered/Brandt's Bat 662

The headline measure is a composite index of eight bat species: serotine, Daubenton's bat, Natterer’s bat, noctule, common pipistrelle, soprano pipistrelle, brown long-eared bat, and lesser horseshoe bat. http://jncc.defra.gov.uk/page-4271

In [204]:
#plotting the total in the bat study
my_colors = list(islice(cycle(['dodgerblue', 'cornflowerblue', 'skyblue', 'slateblue','#191970', '#001CF0', '#0038E2', 
                               '#0055D4', '#0071C6', '#008DB8', '#00AAAA',
                                '#00C69C', '#00E28E', '#00FF80']), None, len(grouped5)))
#dpivoted.plot.pie(y='counts', figsize=(15, 15), colors=my_colors)
dpivoted.plot.pie(y='counts',autopct='%.2f',labels=['','','','','','','','','','','','','','',''],
                    figsize=(14, 18),colors=my_colors,fontsize=16)
plt.axis('equal')
plt.ylabel('')
plt.legend(labels=dpivoted.index,loc=(0.8,0.82), fontsize=14)
plt.title('Total Bats 1999 - 2013', fontsize=20, color='black')
plt.show()
In [205]:
#Totals per year
grpComb['totals'] = grpComb.sum(axis=1)
In [206]:
#Plotting the totals for each year between 1999 and 2013
grpComb['totals'].plot(kind='bar')
Out[206]:
<matplotlib.axes._subplots.AxesSubplot at 0xa3a0a20c>
In [207]:
print(grpComb.totals)
year
1999    1111
2000    1113
2001     966
2002    1277
2003    1276
2004    1380
2005    1388
2006    1439
2007    1427
2008    1705
2009    1799
2010    1926
2011    1789
2012    1570
2013    1750
Name: totals, dtype: float64

Percentages

Calculating the annual percentages to determine is the changes represent a similar pattern to the Defra report as described above. C8. Mammals of the wider countryside (bats) http://jncc.defra.gov.uk/page-4271

In [208]:
def percentIncrease(x,y):
    z = ((y-x)/x)*100
    return z
In [209]:
x=2222
y=2226
percentIncrease(x,y)
Out[209]:
0.18001800180018002
In [210]:
x=2226
y=1932
percentIncrease(x,y)
Out[210]:
-13.20754716981132
In [211]:
x=1932
y=2554
percentIncrease(x,y)
Out[211]:
32.19461697722567
In [212]:
x=2554
y=2552
percentIncrease(x,y)
Out[212]:
-0.07830853563038372
In [213]:
x=2552
y=2760
percentIncrease(x,y)
Out[213]:
8.150470219435736
In [214]:
x=2760
y=2776
percentIncrease(x,y)
Out[214]:
0.5797101449275363
In [215]:
x=2776
y=2878
percentIncrease(x,y)
Out[215]:
3.674351585014409
In [216]:
x=2878
y=2854
percentIncrease(x,y)
Out[216]:
-0.8339124391938846
In [217]:
x=2854
y=3410
percentIncrease(x,y)
Out[217]:
19.48142957252978
In [218]:
x=3410
y=3598
percentIncrease(x,y)
Out[218]:
5.513196480938416
In [219]:
x=3598
y=3852
percentIncrease(x,y)
Out[219]:
7.0594774874930515
In [220]:
x=3852
y=3578
percentIncrease(x,y)
Out[220]:
-7.113187954309449
In [221]:
x=3578
y=3140
percentIncrease(x,y)
Out[221]:
-12.24147568474008
In [222]:
x=3140
y=3500
percentIncrease(x,y)
Out[222]:
11.464968152866243
In [223]:
x=2222
y=3500
percentIncrease(x,y)
Out[223]:
57.515751575157516

Adding the annual percentage changes to a dataframe to plot.

In [224]:
percentChange = {'Year':[2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010, 2011,2012,2013],
                'Change':[0.17969451931716085,15.217391304347828,24.353954581049333,0.07836990595611285,7.536231884057972,
                          0.5763688760806917,
          3.5441278665740095,0.8409250175192713,16.3049853372434,5.225125069483046,6.5939771547248185,7.657909446618222,
          13.949044585987261,10.285714285714285]}
df = pd.DataFrame(percentChange)
df.plot(x='Year', y='Change')
Out[224]:
<matplotlib.axes._subplots.AxesSubplot at 0xa37c68ec>

This plot does not reveal useful information. It shows that there were some drops around 2003 - 2007, but that the trends seem to have stabilised. I found that my data showed a percentage change of 57.5% over the period 1999 to 2013 inclusive, compared to the JNCC report wich showed an overall 23% increase. However the data they have used is complete and has actual counts of individual bats rather than just sightings. The patterns and trends however are similar and the overall story of bats in the UK over the period is consistent in patterns if not percentages.

Weather information used as a guide

The counts for 2001 are lower than those for 1999 and 2000 but this is likely to be due to severe winter weather as reported on the Met Office website at: http://www.metoffice.gov.uk/climate/uk/summaries/2001/february February 2001

*The averaging period used for the following assessment was 1961-1990.

*UK overview

*Wet weather returned to eastern parts of England at the start. As high pressure began to dominate it became drier with sunshine by day and frost at night. Turned cold towards the end with quite significant snowfalls in Scotland and northern England.

*In Scotland, heavy snow at the start and end of the month brought power cuts and disrupted road, rail and air transport.

This would mean that it is still acceptable to treat the period as 1999 - 2013

2010 http://www.metoffice.gov.uk/climate/uk/summaries/2011/winter Winter 2010/11

The following represents an assessment of the weather experienced across the UK during winter 2010 / 2011 (December 2010 to February 2011) and how it compares with the 1981 to 2010 averages.

Mean temperatures over the UK were 4.8 °C below average during December, 0.6 °C below average in January and 1.7 °C above average in February. The UK mean temperature for the winter as a whole was 2.4 °C, making it less cold than winter 2009/10 which was 1.6 °C but still the second-coldest winter since 1985/86 with 2.3 °C. Over Scotland and Northern Ireland it was the second-coldest winter since 1985/86 and 1978/79 respectively, with again only last winter having been colder. Over Northern Ireland it was equal sixth-coldest winter in the series from 1910. Over England and Wales it was the second-coldest since 1995/96, with only last winter having been colder.

December was exceptionally cold across the UK; the coldest December in over 100 years, with the highest number of air frosts in at least the last 50 years. It was the coldest calendar month since February 1986, and in Northern Ireland the coldest calendar month of the last 100 years. Chilly conditions persisted in early January, before a milder spell around mid-month then a return to colder weather. Mean temperatures finished somewhat below normal. Temperatures were generally above average in February, making it the mildest February since 2002. The number of air frosts was the fourth lowest in February in the last 50 years.

Precipitation amounts over the UK were well below average during both December (40% of average) and January (84%) but above in February (129%). In December, there were widespread snowfalls in the first week and from mid-month until Christmas. Iit was the driest December over the UK since 1963 and the third driest in the 100-year series. February was particularly wet in parts of northern England and southern Scotland, where it was one of the wettest Februarys in the last 100 years.

There was a fall in numbers in 2012 and the Met Office has this for that year:

http://www.metoffice.gov.uk/climate/uk/summaries/2013/winter

*Winter 2012/13 The following represents an assessment of the weather experienced across the UK during Winter 2012/2013 (December 2012 to February 2013) and how it compares with the 1981 to 2010 average.

*The mean temperature over the UK for winter was 3.3 °C which is 0.4 °C below the long term average. December was equal to the long term average for the month, January was 0.3 °C below, February was 0.9 °C below and at 2.8 °C was the coldest month of the season. Spells of notably mild weather occurred in late December and early January, and notably cold weather in early December, mid to late January, and the latter part of February.

*Winter overall for the UK was marginally wetter than the long term average with 106%, although much of Highland Scotland was drier than average. It was the wettest December since 1999 with 149% of long term average rain; considerable disruption from flooding events occurred in the run-up to Christmas. January was slightly drier than average for the UK as a whole (91%), with a few localised exceptions in some coastal areas and Northern Ireland. February was also dry (68%). There was a period of widespread snowfall across much of the country from mid to late January as frontal systems hit colder air, causing considerable disruption. This was followed by a rapid thaw in the last few days of the month; snowmelt and further rain resulting in some further localised flooding. Further snow events in February were generally short-lived.

This would suggest that the wet weather may be unsuitable for bats and that the colder weather is better for them. Perhaps this makes them better survivors than competitors and they perhaps fare better in colder weather meaning that animals who don't fare so well are not there to fight for food. This would mean more was available for bats.

In [225]:
grpComb.columns
Out[225]:
Index(['Bechstein's Bat', 'Brandt's Bat', 'Brown Long-eared Bat',
       'Common Pipistrelle', 'Daubenton's Bat', 'Greater Horseshoe Bat',
       'Grey Long-eared Bat', 'Lesser Horseshoe Bat', 'Natterer's Bat',
       'Noctule Bat', 'Pipistrelle', 'Serotine', 'Soprano Pipistrelle',
       'Whiskered Bat', 'Whiskered/Brandt's Bat', 'totals'],
      dtype='object')
In [ ]:
grpComb = grpComb[["Bechstein's Bat", "Brandt's Bat", 'Brown Long-eared Bat',
       'Common Pipistrelle', "Daubenton's Bat", "Greater Horseshoe Bat",
       'Grey Long-eared Bat', 'Lesser Horseshoe Bat', "Natterer's Bat",
       'Noctule Bat', 'Pipistrelle', 'Serotine', 'Soprano Pipistrelle',
       'Whiskered Bat', "Whiskered/Brandt's Bat"]]
In [ ]:
grpComb.plot.area(figsize=(15, 15));
In [ ]:
newgrp = grpComb.copy()
newgrp.drop('totals', axis=1)
In [ ]:
newgrp.columns
In [ ]:
figsize=(15, 15)
In [ ]:
MapBats = Combined_df[['latitude','longitude','commonname']]
In [ ]:
MapBats.to_csv('data/combinedBats.csv', index = False)
In [ ]:
Combined_df.columns
In [ ]:
year = list(newgrp.year)
In [ ]:
import matplotlib.patches as mpatches
# We're going to build the legend patch list one element at a time;
# it starts empty.
legendpatch_list = []
colours = {'Soprano Pipistrelle':'red', 'Common Pipistrelle':'red', "Brandt's Bat":'red',
       'Western Barbastelle':'red', 'Pipistrelle':'red', 'Brown Long-eared Bat':'red',
       "Daubenton's Bat":'red', 'Bat':'red', 'Serotine':'red', 'Greater Horseshoe Bat':'red',
       'Grey Long-eared Bat':'red', 'Long-eared Bat species':'red',
       'Lesser Horseshoe Bat':'red', "Bechstein's Bat":'red', "Whiskered/Brandt's Bat":'red',
       "Natterer's Bat":'red', 'Mouse-eared Bat':'red', 'Lesser Noctule':'red',
       'Whiskered Bat':'red', 'Myotis Bat species':'red', 'Noctule Bat':'red'}
# We want to loop for each Region

for key in pivoted:
    

    # for each type create the legend patch
    legendpatch_list = legendpatch_list+([mpatches.Patch(color=colours[key], label=key)])