---
title: "Advanced search and citation of occurrences"
author:
- Hannah L. Owens
- Cory Merow
- Brian Maitner
- Jamie M. Kass
- Vijay Barve
- Robert Guralnick
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    fig_caption: yes
    toc: true
    toc_depth: 3
vignette: >
  %\VignetteIndexEntry{Advanced search and citation of occurrences}
  \usepackage[utf8]{inputenc}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  chunk_output_type: console
---

```{r setup, include=FALSE}
library(ape)
library(occCite)
knitr::opts_chunk$set(echo = TRUE, error = TRUE)
knitr::opts_knit$set(root.dir = system.file('extdata/', package='occCite'))
```

# Advanced features

This vignette demonstrates more advanced features and customization available in `occCite`. We recommend you read `vignette("Simple", package = "occCite")` first, if you have not already done so.

## Loading data from previous GBIF searches

Querying GBIF can take quite a bit of time, especially for multiple species and/or well-known species. In this case, you may wish to access previously-downloaded data sets from your computer by specifying the general location of your downloaded `.zip` files. `occQuery` will crawl through your specified `GBIFDownloadDirectory` to collect all the `.zip` files contained in that folder and its subfolders. It will then import the most recent downloads that match your taxon list. If you chose BIEN as well as GBIF, these GBIF data will be appended to the BIEN search results, just as they are in a simple real-time search.

`checkPreviousGBIFDownload` is `TRUE` by default, but if `loadLocalGBIFDownload` is `TRUE`, `occQuery` will ignore `checkPreviousGBIFDownload`. It is also worth noting that `occCite` does not currently support mixed data download sources. That is, you cannot do GBIF queries for some taxa, download previously-prepared data sets for others, and load the rest from local data sets on your computer.

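
To make the directory crawl described above more concrete, here is a minimal base-R sketch of the file-discovery step only: find every `.zip` under a folder and its subfolders, then order them newest first. The helper `findZipDownloads` and the demo folder are hypothetical and are not part of `occCite`; how `occQuery` actually matches downloads to your taxon list is not shown here.

```{r zip_crawl_sketch, eval=FALSE}
# Hypothetical helper (not an occCite function): list all .zip files under
# a directory, including subfolders, ordered from newest to oldest.
findZipDownloads <- function(downloadDir) {
  zipFiles <- list.files(downloadDir, pattern = "\\.zip$",
                         recursive = TRUE, full.names = TRUE)
  modified <- file.info(zipFiles)$mtime
  zipFiles[order(modified, decreasing = TRUE)]
}

# Build a throwaway folder tree to demonstrate the crawl
demoDir <- file.path(tempdir(), "gbifDemo")
dir.create(file.path(demoDir, "older"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demoDir, "older", "a.zip"))
file.create(file.path(demoDir, "b.zip"))
file.create(file.path(demoDir, "notes.txt"))

# Pretend the nested file was downloaded an hour earlier
Sys.setFileTime(file.path(demoDir, "older", "a.zip"), Sys.time() - 3600)

found <- findZipDownloads(demoDir)
basename(found)
```

Only the two `.zip` files are returned, and the nested, older one sorts last. In practice you would simply pass the top-level folder to `GBIFDownloadDirectory` and let `occQuery` do this work.
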
```{r simple_search, eval=F}
# Simple search
myOldOccCiteObject <- occQuery(x = "Protea cynaroides",
                               datasources = c("gbif", "bien"),
                               GBIFLogin = GBIFLogin,
                               GBIFDownloadDirectory = system.file('extdata/', package='occCite'),
                               checkPreviousGBIFDownload = T)
```

```{r simple_search sssssecret cooking show, eval=T, echo = F}
# Simple search
data(myOccCiteObject)
myOldOccCiteObject <- myOccCiteObject
```

Here is the result. Look familiar?

```{r simple_search_loaded_GBIF_results}
#GBIF search results
head(myOldOccCiteObject@occResults$`Protea cynaroides`$GBIF$OccurrenceTable)
#The full summary
summary(myOldOccCiteObject)
```

Getting citation data works exactly the same way with previously-downloaded data as it does with a fresh data set.

```{r getting_citations_from_already-downloaded_GBIF_data}
#Get citations
myOldOccCitations <- occCitation(myOldOccCiteObject)
print(myOldOccCitations)
```

Note that you can also load multiple species using either a vector of species names or a phylogeny (provided you have previously downloaded data for all of the species of interest), and you can load occurrences from non-GBIF data sources (e.g. BIEN) in the same query.

***

## Performing a Multi-Species Search

In addition to doing a simple, single-species search, you can also use `occCite` to search for and manage occurrence datasets for multiple species. You can either submit a vector of species names, or you can submit a *phylogeny*! For multiple species, `occCitation` returns a named list of citation tables.

## occCite with a Phylogeny

Here is an example of how such a search is structured, using an unpublished phylogeny of billfishes.

```{r multispecies_search_with_phylogeny, eval=T, echo=T}
library(ape)
#Get tree
treeFile <- system.file("extdata/Fish_12Tax_time_calibrated.tre", package='occCite')
phylogeny <- ape::read.nexus(treeFile)
tree <- ape::extract.clade(phylogeny, 22)
#Query databases for names
myPhyOccCiteObject <- studyTaxonList(x = tree,
                                     datasources = "GBIF Backbone Taxonomy")
#Query GBIF for occurrence data
myPhyOccCiteObject <- occQuery(x = myPhyOccCiteObject,
                               datasources = "gbif",
                               GBIFDownloadDirectory = system.file('extdata/', package='occCite'),
                               loadLocalGBIFDownload = T,
                               checkPreviousGBIFDownload = F)
# What does a multispecies query look like?
summary(myPhyOccCiteObject)
```

When you have results for multiple species, as in this case, you can also plot the summary figures either for the whole search...

```{r plotting all species, eval=T, message=FALSE, warning=FALSE, paged.print=FALSE, results='hide', fig.hold='hold', out.width="100%"}
plot(myPhyOccCiteObject)
```

*or* you can plot the results by species!

```{r plotting phylogenetic search by species, eval=T, message=FALSE, warning=FALSE, paged.print=FALSE, results='hide', fig.hold='hold', out.width="100%"}
plot(myPhyOccCiteObject, bySpecies = T, plotTypes = c("yearHistogram", "source"))
```

And then you can print out the citations, separated by species (or not, but in this example, they're separate).

```{r getting_citations_for_a_multispecies_search, echo=T}
#Get citations
myPhyOccCitations <- occCitation(myPhyOccCiteObject)

#Print citations as text with accession dates.
print(myPhyOccCitations, bySpecies = T)
```
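
The phylogeny example above picks its clade with a hard-coded node number (`22`). As a minimal sketch of that step, the toy tree below shows one way to find a node from tip labels using `ape::getMRCA` before calling `ape::extract.clade`. The tree, its topology, and the placeholder names (`Aus_bus` and so on) are invented for illustration and are not taken from the billfish phylogeny.

```{r clade_selection_sketch, eval=FALSE}
library(ape)

# Toy four-tip tree with placeholder names (hypothetical, not real taxa)
toyTree <- ape::read.tree(text = "((Aus_bus,Aus_cus),(Bus_dus,Bus_eus));")

# Find the node that is the most recent common ancestor of two tips,
# instead of hard-coding a node number
cladeNode <- ape::getMRCA(toyTree, c("Aus_bus", "Aus_cus"))

# Keep only the clade descending from that node
toyClade <- ape::extract.clade(toyTree, cladeNode)

# The tip labels are the names the tree carries; shown here with spaces
cladeNames <- gsub("_", " ", toyClade$tip.label)
cladeNames
```

Looking up the node by tip labels keeps the selection readable and keeps it working if the tree file is re-rooted or renumbered. The resulting `phylo` object could then be passed to `studyTaxonList` in the same way as `tree` above.
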