Added ALA directory and data files #125

Merged
merged 1 commit into from
Jul 11, 2018
40 changes: 40 additions & 0 deletions data/ALADistinctValues/ALA_DwC_field_counts.sh
@@ -0,0 +1,40 @@
#!/usr/bin/env bash
#
# Script to generate distinct values and counts for DwC fields in the ALA (CSV output)
# Requires the jq tool to be installed - see https://stedolan.github.io/jq/
# Requires BASH v.4+
#
# Author: Nick dos Remedios <[email protected]> 2018-07-04
#
# requested fields:
#
# basisOfRecord, continent, countrycode, country, day, month, year, disposition, establishmentMeans, geodeticDatum, georeferenceVerificationStatus, identificationQualifier, identificationVerificationStatus, islandGroup, island, language, license, lifeStage, nomenclaturalCode, occurrenceStatus, organismScope, preparations, reproductiveCondition, sex, taxonRank, taxonomicStatus, typeStatus, type, verbatimSRS, waterbody
# ALA SOLR fields
# basis_of_record continent country_code country day year disposition establishment_means geodeticDatum georeferenceVerificationStatus identificationQualifier identificationVerificationStatus islandGroup island language license lifeStage month nomenclaturalCode occurrenceStatus organismScope preparations reproductiveCondition sex taxonRank taxonomicStatus typeStatus type verbatimSRS waterbody

# Only fields present in the ALA SOLR index are used
# see https://biocache.ala.org.au/fields for ALA field names and indexed status, etc.

today=$(date +%Y-%m-%d)
# Associative array - requires BASH version >= 4
declare -A DWCMAP
DWCMAP[basis_of_record]=basisOfRecord
DWCMAP[country_code]=countrycode
DWCMAP[country]=country
DWCMAP[month]=month
DWCMAP[year]=year
DWCMAP[establishment_means]=establishmentMeans
DWCMAP[raw_identification_qualifier]=identificationQualifier
DWCMAP[license]=license
DWCMAP[occurrence_status_s]=occurrenceStatus
DWCMAP[reproductive_condition_s]=reproductiveCondition
DWCMAP[raw_sex]=sex
DWCMAP[rank]=taxonRank
DWCMAP[type_status]=typeStatus
#for field in basis_of_record country_code country month year establishment_means raw_identification_qualifier license occurrence_status_s reproductive_condition_s raw_sex rank type_status

for field in "${!DWCMAP[@]}"
do
#echo "$field -> ${DWCMAP[$field]}"
curl -s "https://biocache.ala.org.au/ws/occurrence/facets?q=*:*&facets=${field}&flimit=999&fsort=index" | jq -r '.[] | .fieldResult[] | [.label,.count] | @csv' > "ALA_distinct_${DWCMAP[$field]}_${today}.csv"
done
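
The loop above sends one facet request per SOLR field and names each output file after the mapped DwC term. A minimal dry-run sketch of that mapping, assuming the same URL pattern as the script: the helpers `facet_url` and `output_name` are hypothetical names introduced here, not part of this PR, and only three of the map entries are repeated. Nothing is fetched; the sketch only prints what would be requested and where it would be written.

```shell
#!/usr/bin/env bash
# Dry run: print what the fetch loop would request, without calling curl.
# facet_url and output_name are hypothetical helpers, not part of the PR.
# Requires Bash 4+ (associative arrays).

declare -A DWCMAP=(
  [basis_of_record]=basisOfRecord
  [country_code]=countrycode
  [raw_sex]=sex
)

# Build the facet query URL for one SOLR field (pattern copied from the script).
facet_url() {
  local field=$1 limit=${2:-999}
  printf 'https://biocache.ala.org.au/ws/occurrence/facets?q=*:*&facets=%s&flimit=%s&fsort=index\n' "$field" "$limit"
}

# Build the CSV file name for one DwC term and a date stamp.
output_name() {
  printf 'ALA_distinct_%s_%s.csv\n' "$1" "$2"
}

today=$(date +%Y-%m-%d)

# Associative arrays have no defined key order, so sort the keys first
# to make the listing repeatable between runs.
while IFS= read -r field; do
  printf '%s -> %s\n' "$(facet_url "$field")" "$(output_name "${DWCMAP[$field]}" "$today")"
done < <(printf '%s\n' "${!DWCMAP[@]}" | sort)
```

Sorting the keys is a choice made for this sketch only; the script iterates in whatever order Bash returns the keys, which does not affect the files it writes.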
12 changes: 12 additions & 0 deletions data/ALADistinctValues/ALA_distinct_basisOfRecord_2018-07-04.csv
@@ -0,0 +1,12 @@
"FossilSpecimen",29174
"GenomicDNA",155317
"HumanObservation",58874775
"Image",136612
"Literature",575
"LivingSpecimen",158146
"MachineObservation",536143
"MaterialSample",170896
"NomenclaturalChecklist",5923
"PreservedSpecimen",12401282
"Sound",4565
"",2507006
272 changes: 272 additions & 0 deletions data/ALADistinctValues/ALA_distinct_country_2018-07-04.csv
@@ -0,0 +1,272 @@
"Afghanistan",104
"Albania",13
"Algeria",940
"American Samoa",530
"Andorra",10
"Angola",445
"Antarctica",33569
"Antigua and Barbuda",14
"Argentina",4253
"Armenia",135
"Aruba",2
"Australia",67676376
"Austria",3477
"Azerbaijan",99
"Bahamas",678
"Bahrain",13
"Bangladesh",2769
"Barbados",106
"Belarus",51
"Belgium",686
"Belize",167
"Benin",164
"Bermuda",101
"Bhutan",1257
"Bolivia",772
"Bolivia, Plurinational State of",101
"Bonaire, Sint Eustatius and Saba",1
"Bosnia and Herzegovina",6
"Botswana",239
"Bouvet Island",49
"Brazil",6122
"British Indian Ocean Territory",767
"British Virgin Islands",5
"Brunei",895
"Brunei Darussalam",212
"Bulgaria",262
"Burkina Faso",16
"Burundi",61
"Cambodia",1000
"Cameroon",548
"Canada",9167
"Cape Verde",93
"Caspian Sea",19
"Cayman Islands",8
"Central African Republic",111
"Chad",7
"Chile",5456
"China",7144
"Christmas Island",14725
"Clipperton Island",16
"Cocos (Keeling) Islands",104
"Cocos Islands",1319
"Colombia",1882
"Comoros",91
"Congo",90
"Congo, the Democratic Republic of the",238
"Cook Islands",8664
"Costa Rica",1318
"Croatia",376
"Cuba",959
"Curaçao",1
"Cyprus",127
"Czech Republic",471
"Côte d'Ivoire",5
"C��te d'Ivoire",74
"Democratic Republic of the Congo",226
"Denmark",825
"Djibouti",24
"Dominica",104
"Dominican Republic",369
"East Timor",7999
"Ecuador",3237
"Egypt",1588
"El Salvador",27
"Equatorial Guinea",42
"Eritrea",91
"Estonia",576
"Ethiopia",381
"Falkland Islands",95
"Falkland Islands (Malvinas)",156
"Faroe Islands",33
"Fiji",12150
"Finland",2210
"France",6992
"French Guiana",266
"French Polynesia",5135
"French Southern Territories",1886
"Gabon",149
"Gambia",14
"Georgia",121
"Germany",8452
"Ghana",631
"Gibraltar",18
"Greece",1107
"Greenland",517
"Grenada",8
"Guadeloupe",95
"Guam",758
"Guatemala",209
"Guernsey",36
"Guinea",18
"Guinea-Bissau",1
"Guyana",533
"Haiti",128
"Heard Island and McDonald Islands",19807
"Holy See (Vatican City State)",7
"Honduras",320
"Hong Kong",1641
"Hungary",848
"Iceland",523
"India",10724
"Indonesia",144954
"Iran",107
"Iran, Islamic Republic of",147
"Iraq",255
"Ireland",652
"Isle of Man",15
"Israel",982
"Italy",5699
"Jamaica",723
"Japan",11427
"Jersey",62
"Jordan",34
"Kazakhstan",305
"Kenya",1512
"Kiribati",413
"Korea, Democratic People's Republic of",5
"Korea, Republic of",39
"Kuwait",50
"Kyrgyzstan",68
"Lao People's Democratic Republic",243
"Laos",4772
"Latvia",14
"Lebanon",135
"Lesotho",228
"Liberia",26
"Libya",57
"Libyan Arab Jamahiriya",25
"Liechtenstein",1
"Lithuania",49
"Luxembourg",48
"Macao",10
"Macedonia",8
"Macedonia, the former Yugoslav Republic of",15
"Madagascar",4581
"Malawi",417
"Malaysia",15769
"Maldives",50
"Mali",59
"Malta",178
"Marshall Islands",337
"Martinique",146
"Mauritania",36
"Mauritius",6501
"Mayotte",515
"Mexico",4426
"Micronesia",723
"Micronesia, Federated States of",740
"Moldova",9
"Monaco",30
"Mongolia",95
"Montenegro",20
"Montserrat",2
"Morocco",391
"Mozambique",449
"Myanmar",1954
"Namibia",736
"Nauru",264
"Nepal",1872
"Netherlands",1902
"Netherlands Antilles",37
"New Caledonia",41432
"New Zealand",830288
"Nicaragua",241
"Niger",70
"Nigeria",319
"Niue",1638
"Norfolk Island",12814
"North Korea",9
"Northern Mariana Islands",112
"Norway",1805
"Oman",888
"Pakistan",790
"Palau",853
"Palestina",20
"Palestinian Territory, Occupied",3
"Panama",1465
"Papua New Guinea",332203
"Paraguay",663
"Peru",1512
"Philippines",14946
"Pitcairn",6
"Pitcairn Islands",91
"Poland",3033
"Portugal",750
"Puerto Rico",417
"Qatar",17
"Republic of Congo",21
"Reunion",515
"Romania",764
"Russia",1205
"Russian Federation",722
"Rwanda",341
"Réunion",88
"Saint Barthélemy",1
"Saint Helena",35
"Saint Helena, Ascension and Tristan da Cunha",35
"Saint Kitts and Nevis",37
"Saint Lucia",17
"Saint Vincent and the Grenadines",48
"Samoa",3517
"Sao Tome and Principe",9
"Saudi Arabia",227
"Senegal",136
"Serbia",79
"Seychelles",1470
"Sierra Leone",136
"Singapore",2165
"Slovakia",160
"Slovenia",177
"Solomon Islands",22870
"Somalia",204
"South Africa",15858
"South Georgia and the South Sandwich Islands",923
"South Korea",224
"South Sudan",6
"Spain",3401
"Sri Lanka",5056
"Sudan",465
"Suriname",173
"Svalbard and Jan Mayen",33
"Swaziland",143
"Sweden",4470
"Switzerland",2323
"Syria",56
"Syrian Arab Republic",21
"Taiwan",1582
"Taiwan, Province of China",292
"Tajikistan",44
"Tanzania",928
"Tanzania, United Republic of",632
"Thailand",5622
"Timor-Leste",251
"Togo",16
"Tokelau",164
"Tonga",4279
"Trinidad and Tobago",525
"Tunisia",451
"Turkey",763
"Turkmenistan",56
"Tuvalu",361
"Uganda",510
"Ukraine",248
"United Arab Emirates",43
"United Kingdom",10651
"United States",48259
"United States Minor Outlying Islands",15
"Uruguay",495
"Uzbekistan",171
"Vanuatu",10230
"Venezuela",743
"Venezuela, Bolivarian Republic of",539
"Viet Nam",948
"Vietnam",1451
"Virgin Islands, U.S.",178
"Wallis and Futuna",50
"Western Sahara",22
"Yemen",144
"Zambia",276
"Zimbabwe",891
"��land",59
"",5478133
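
The country file lists several spellings of the same place as separate values, for example "Viet Nam" and "Vietnam", or "Russia" and "Russian Federation". A minimal sketch of how such variants could be folded together when working with these CSVs: the `ALIAS` table and the `merge_variants` function are hypothetical and hand-picked for illustration, not something this PR provides. The count is read as everything after the last comma, so labels that themselves contain commas are handled, and rows with an empty label are passed through as one blank row.

```shell
#!/usr/bin/env bash
# Merge spelling variants of a country label and sum their counts.
# Requires Bash 4+ (associative arrays). ALIAS is a hypothetical,
# hand-picked mapping for illustration only.

declare -A ALIAS=(
  ["Viet Nam"]="Vietnam"
  ["Russian Federation"]="Russia"
  ["Iran, Islamic Republic of"]="Iran"
)

# Read "label",count rows on stdin; write merged rows, sorted, on stdout.
merge_variants() {
  local line label count blank=0 seen_blank=0
  local -A totals=()
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    count=${line##*,}      # text after the last comma
    label=${line%,*}       # text before the last comma
    label=${label#\"}      # strip the surrounding quotes
    label=${label%\"}
    if [ -z "$label" ]; then
      # An empty string is not a safe array key, so track blanks separately.
      blank=$(( blank + count ))
      seen_blank=1
      continue
    fi
    label=${ALIAS[$label]:-$label}
    totals[$label]=$(( ${totals[$label]:-0} + count ))
  done
  {
    for label in "${!totals[@]}"; do
      printf '"%s",%s\n' "$label" "${totals[$label]}"
    done
    if [ "$seen_blank" -eq 1 ]; then
      printf '"",%s\n' "$blank"
    fi
  } | LC_ALL=C sort
}

# Four rows copied from the country file above.
merge_variants <<'EOF'
"Russia",1205
"Russian Federation",722
"Viet Nam",948
"Vietnam",1451
EOF
```

A real alias table would need review against a country-name authority; the three entries here only show the mechanism.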