Title: | Linked Micromap Plots for U. S. and Other Geographic Areas |
---|---|
Description: | Provides the users with the ability to quickly create linked micromap plots for a collection of geographic areas. Linked micromap plots are visualizations of geo-referenced data that link statistical graphics to an organized series of small maps or graphic images. The Help description contains examples of how to use the 'micromapST' function. Contained in this package are border group datasets to support creating linked micromap plots for the 50 U.S. states and District of Columbia (51 areas), the U. S. 20 Seer Registries, the 105 counties in the state of Kansas, the 62 counties of New York, the 24 counties of Maryland, the 29 counties of Utah, the 32 administrative areas in China, the 218 administrative areas in the UK and Ireland (for testing only), the 25 districts in the city of Seoul South Korea, and the 52 counties on the Africa continent. A border group dataset contains the boundaries related to the data level areas, a second layer boundaries, a top or third layer boundary, a parameter list of run options, and a cross indexing table between area names, abbreviations, numeric identification and alias matching strings for the specific geographic area. By specifying a border group, the package create linked micromap plots for any geographic region. The user can create and provide their own border group dataset for any area beyond the areas contained within the package. In version 3.0.0, the 'BuildBorderGroup' function was upgraded to not use the retiring 'maptools', 'rgdal', and 'rgeos' packages. References: Carr and Pickle, Chapman and Hall/CRC, Visualizing Data Patterns with Micromaps, CRC Press, 2010. Pickle, Pearson, and Carr (2015), micromapST: Exploring and Communicating Geospatial Patterns in US State Data., Journal of Statistical Software, 63(3), 1-25., <https://www.jstatsoft.org/v63/i03/>. Copyrighted 2013, 2014, 2015, 2016, 2022, 2023, and 2024 by Carr, Pearson and Pickle. |
Authors: | Jim Pearson [aut, cre, cph], Dan Carr [aut, cph], Linda Pickle [ctb, cph] |
Maintainer: | Jim Pearson <[email protected]> |
License: | GPL (>= 2) |
Version: | 3.0.4 |
Built: | 2024-10-27 12:33:09 UTC |
Source: | CRAN |
The micromapST package provides a means of creating multiple column graphics representing data for a collection of geographic areas. The originally micromapST was limited to creating linked micromaps for each US 50 state and the District of Columbia. With this release, the package has been updated to be able to create linked micromaps for any collection of areas that is defined in a border group dataset (see details). Each area's graphical element is linked to a small map by means of color.
Package: | micromapST |
Type: | Package |
Version: | 3.0.4 |
Date: | 2024-10-26 |
License: | GPL-2 |
LazyLoad: | no |
The package uses the following R released packages: stringr, RColorBrewer, graphics, stats, grDevices, and labeling. When the package is download, these support packages should also be loaded. If not, please install these packages before loading micromapST.
Linked micromap plots link statistical graphs to an organized set of small maps, thus adding geographical context to the graphs and data. The micromapST package has been expanded to be able to use boundary data from any collection of geographic areas through the use of border group datasets. Each border group dataset contains five R objects that define the boundaries, run parameters and name, abbreviation, and ID relationships for a geographic region. When a border group is specified in the function call, the associated dataset is loaded and the five R objects become the key data structures used by micromapST to create the linked micromaps. The five R objects are:
areaParms - run parameters for this border group
areaNamesAbbrsIDs - the area name, abbreviation and numeric ID associates for each area.
areaVisBorders - the boundary data for each area, indexed by the area abbreviation
L2VisBorders - the boundary data for overlaying boundaries within the areas.
RegVisBorders - the boundary data for a region of areas.
L3VisBorders - the boundary data for the entire collection of areas.
The currently the supported sets of border groups contain in this package are:
USStatesBG - The 50 U.S. states and the District of Columbia
USSeerBG - The 20 U.S. Seer Areas within the U.S.
KansasBG - The 105 counties in the state of Kansas
MarylandBG - The 24 counties of Maryland
NewYorkBG - The 62 counties in the state of New York
UtahBG - The 29 counties of Utah
SeoulSKoreaBG - The 25 districts in the city of Seoul.
UKIrelandBG - The 218 counties in UK and Ireland, for testing only.
ChinaBG - The 32 provinces, municipalities, autonomous regions, and Special Administrative divisions of China.
AfricaBG - The 52 countries of the African continent.
The default border group is USStatesBG to allow users of previous releases of micromapST to run without changes to their code. To support users of the test release of micromapSEER, a function micromapSEER is included. The function micromapSEER sets up required parameters and then calls micromapST with the border group set to USSeerBG to create the same linked micromaps as the micromapSEER package created. All of the call parameters are the same. The two packages were merged to allow micromapSEER users to benefit from new features and fixes are released under micromapST.
Additional border groups will be added over time. The user may also create their own border
group for use with micromapST (see paper on creating micromapST border groups.)
The entire micromap is created to fits on a single page.
The page may be portrait or landscape and can range from an 8.5 x 11 up to a 11 to 17 page.
Areas are grouped into panels from 3 to 5
areas each based on the sort variable, with the median-valued area set off
in a separate panel in the middle of the page. If the median panel contains more than
1 area, a full link micromap panel is generated. Otherwise a single line representing
the area is drawn and the median area is highlighted in the panels above and below the median.
In the case of the U.S. Map and 51 states, there are 5 panels of 5 areas (states) above the
median and 5 panels of 5 areas (states) below the median row.
The U.S. Seer Registry data may be groups of 9, 11, 13, 17 or 18 registries. Number of registries per panel and number of panels are dynamically setup based on the number of registries involved in the micromap. There are a variety of glyphs the caller can specify for each column of the micromap: the US map with areas colored, a list of registry names or abbreviations (the default) and one or more statistical graphics. The order of these panels is specified by the caller.
The statistical glyphs implemented in this version are plots of dots, dots with significant, dots with confidence intervals, dots and intervals based on Standard Error, horizontal bars, arrows, time series with or without confidence bands, horizontal stacked (segmented (SEGBAR), normalized (NORMBAR) or centered (CTRBAR)) bar charts, scatter plots and boxplots. The layout of the linked micromap plot is defined by the panelDesc data.frame that is passed to the micromapST function.
If the micromap cannot fit on one page, warnings are generated and the function is stopped. It is suggested the caller increase the size of the page (graphic space) being used to compensate.
The U.S. map of states and areas used by the USStatesBG and USSeerBG border groups are generalized boundary map, based on Mark Monmonier's visibility map. These maps are simplified to maximize the color areas shown for each state and to minimize the length of the boundary lines while still allowing identification of each area. At some future time, all border groups should have their boundaries characterized to enhance the linked micromap's readability.
One of the biggest enhancements in this version of micromapST, is support for other geographic areas. This has been added using border groups. Each border group data set contains the unique collection of information, run parameters, names, abbreviations, and boundary information for the geographic region. The package contains some pre-made border groups and the package user is encourage gather the required information and create their own border group. This is not a small task and required boundary file manipulation to reduce the complexity of the boundaries and researching to identify a suitable list of names, abbreviations and ID for each sub-area within the desired geographical area. The author is working on a guideline and a step by step procedure to help the user create their own border groups. This release includes border group to re-create the original link micromaps in earlier versions of micromapST for the 50 U. S. States and DC and micromapSEER for the 20 NCI Seer Registries. Several other test versions of border groups have been included in this release. The complete list of border groups included are:
USStatesBG - the default border group to continue generating the original linked micromaps in earlier versions of micromapST.
USSeerBG - a border group to support creating linked micromaps for NCI. A function micromapSEER is included to support earlier coding. This function automatically pulls in the USSeerBG border group.
KansasBG - a border group to support the generation of linked micromaps for the 105 counties in the state of Kansas.
MarylandBG - a border group to support the generation of linked micromaps for the 23 counties and the city of Baltimore in the state of Maryland.
NewYorkBG - a border group to support the generation of linked micromaps for the 62 counties in the state of New York.
UtahBG - a border group to support the generation of linked micromaps for the 29 counties in the state of Utah.
ChinaBG - a border group to support the generation of linked micromaps for the 35 provinces, special administrative regions, and autonomous regions of China.
UKIrelandBG - a border group for testing a large number of sub-areas (counties) in multiple regions. The package currently supports up to 110. This border group is designed to help in testing and developing support for up to 218 sub areas (counties, etc.)
SeoulSKoreaBG - a border group to support the generation of linked micromaps for the 25 districts within the city of Seoul South Korea.
AfricaBG - a border group to support creation of linked micromaps for 52 countries of Africa. This border group is for a continent with the sub-areas being countries.
The following datasets have also been included for use in many examples:
Educ8thData - Education Data from an 8th grade survey.
TSdata - Sample time series data.
statePop2010 - State Population data for 2010.
wflung00and95 - White Female Lung cancer data for 2000 and 1995.
wflung00and95US - White Female Lung cancer data totals for 2000 and 1995.
wflung20cnty - White Female Lung cancer sample data by county for 2000.
wmlung5070 - White Male Lung cancer sample data for 1950 and 1970.
wmlung5070US - White Male Lung cancer sample data for 1950 and 1970 totals.
Refer to the chapter on each border group for definitions on the Names, Abbreviations, and IDs used in the border group to link the user data to the boundary information to draw the micromap maps.
The sort order of the rows (areas) is based one of the statistical data columns as specified by the user. Correlation between multiple statistical columns can be judged visually by comparing the pattern of one column's values from top to bottom of the page with that of the sorted column. Spatial clusters of states with similar values of the sorting variable can be identified on the small maps that are linked to the graphics by color.
A area linked micromap plot is generated by 4 steps:
# load the package library(micromapST) # Read, create or collect your data into a data.frame. statsDFrame <- data.frame(a row per area, column per variable to be plotted, row.names set to the area names or abbreviations) # now set up a data frame that defines the labels, # panel and page layout panelDesc<-data.frame(...) # specify the data source, panelDesc, sorting variable and # order, and call the stateMicromap function micromapST(statsDFrame, panelDesc, title=c("title1","title2"), details=list(options=values))
The package contains a set of examples of how to produce linked area micromaps. The datasets used in each example are provided to help you learn how to use micromapSEER.
As of release 2.0.1 of the package, a new function has been added. The BuildBorderGroup function provide assistance to the user in building their own border groups from shape file and a name table as needed for geographic areas not covered by areas included in the package. Review the documentation on the BuildBorderGroup function for more details.
There are four primary sources for boundary data:
U. S. Census Bureau for U. S. states and territories (https://www.census.gov/) (2021 - Cartographic Boundary Files) ;
GADM (Global Administrative area Database) for all of the countries and sub-divisions in the world (https://gadm.org) (V4.1, July 16, 2022 release, older versions are available.);
DIVA-GIS (https://diva-gis.org) (January 2012); and
local governments.
All of the boundary data (shape files or json format) are free for public use. Care must be taken to ensure your data matches the boundary areas available in each collection. Data location IDs and boundaries change over time as areas are split, merged, added, or eliminated. The BuildBorderGroup function tries to clean up and repair the boundary data as best as it can. Always inspect the boundary data and make sure its usable for your application and is valid.
With the announcement of the support retirement of maptools, rgeos, rgdal, and effectively sp and spdep, this package had to be upgraded to continue to be useful by it's users. V3.0.0 completes that enhancement.
The packaged is tuned to work with an area 7.5" wide and 10.5" high. Testing has shown it works well with portrait or landscape orientation and areas up to 11" x 17".
The examples in this package the output is directed to a PDF file for best
clarity and resolution. File types of SVG, PNG, JPEG or TIFF can also be used.
If the output is directed to a window, it is suggested a
windows( 7.5, 10.5, xpinch=72, ypinch=72, pointsize=9 )
command
is used to set up the window to best display the linked micromap.
The results will vary based on the resolution of the monitor being used.
Daniel B. Carr [email protected] and
James B Pearson, Jr [email protected],
with contributions from Linda Pickle
Maintainer: "James B. Pearson Jr." [email protected]
Package compiled by "James B, Pearson, Jr." [email protected]
Daniel B. Carr and Linda Williams Pickle, Visualizing Data Patterns with
Micromaps, CRC Press, 2010
Linda Williams Pickle, James B. Pearson Jr., Daniel B. Carr (2015),
micromapST: Exploring and Communicating Geospatial Patterns in US State Data.,
Journal of Statistical Software, 63(3), 1-25.,
https://www.jstatsoft.org/v63/i03/
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call argument is used to specify the border group dataset for the geographical area. The AfricaBG border group dataset supports creating linked micromaps for the 52 countries (sub-areas) on the African continent. When the bordGrp call argument is set to AfricaBG, the appropriate name table (country names and abbreviations) and the 52 sub-areas (countries) boundary data is loaded in micromapST. The user's data is then linked to the boundary data via the country's name, abbreviation, alternate abbreviation, or ID based on the table below.
data(AfricaBG)
data(AfricaBG)
The AfricaBG border group dataset contains the following data.frames:
- contains specific parameters for the border group
- containing the names, abbreviations, and numerical identifier for the 59 countries in Africa.
- the boundary point lists for each country in Africa.
- the boundaries for an intermediate level and is not used in this border group and is set to L3VisBorders as a place holder.
- the boundaries for regions in Africa. In this implementation of the border group, no regions are specified. This data frame is not used and is set to L3VisBorders as a place holder.
- the boundary of the Africa continent.
The Africa continent border group contains 52 country sub-areas. Each country has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets. No regions are defined in the Africa border group, so the L2VisBorders dataset is not used and the regions option is disabled. The L3VisBorders dataset contains the outline of the Africa continent.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area (country) using the fullname, abbreviation, alternate abbreviation, and numerical identifier for each country to the <statsDFrame> data based on the setting of the rowNames call argument.
A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area (country) in the name table, the sub-area is mapped but not colored.
The following are a list of the names, abbreviations, alternate abbreviations and IDs for each country in the AfricaBG border group.
name | ab | alt_ab | id |
Algeria | ALG | DZ | 01 |
Angola | ANG | AO | 02 |
Benin | BEN | BJ | 03 |
Botswana | BOT | BW | 04 |
Burkina Faso | BUF | BF | 05 |
Burundi | BUR | BI | 06 |
Cameroon | CAM | CM | 07 |
Cape Verde | CAP | CV | 08 |
Central African Republic | CAR | CF | 09 |
Chad | CHA | TD | 10 |
Comoros | COM | KM | 11 |
Congo-Brazzaville | CNG | CG | 12 |
Cote d`Ivoire | CDI | CI | 13 |
Democratic Republic of Congo | ZAI | ZR | 14 |
Djibouti | DJI | DJ | 15 |
Egypt | EGY | EG | 16 |
Equatorial Guinea | EQG | GQ | 17 |
Eritrea | ERI | ER | 18 |
Ethiopia | ETH | ET | 19 |
Gabon | GAB | GA | 20 |
Gambia | GAM | GM | 21 |
Ghana | GHA | GH | 22 |
Guinea | GIN | GN | 23 |
Guinea-Bissau | GUB | GW | 24 |
Kenya | KEN | KE | 25 |
Lesotho | LES | LS | 26 |
Liberia | LIB | LR | 27 |
Libya | LAJ | LY | 28 |
Madagascar | MAD | MG | 29 |
Malawi | MAA | MW | 30 |
Mali | MAL | ML | 31 |
Mauritania | MAU | MR | 32 |
Morocco | MOR | MA | 33 |
Mozambique | MOZ | MZ | 34 |
Namibia | NAM | NA | 35 |
Niger | NIG | NE | 36 |
Nigeria | NIR | NG | 37 |
Rwanda | RWA | RW | 38 |
Sao Tome and Principe | STP | ST | 39 |
Senegal | SEN | SN | 40 |
Sierra Leone | SIL | SL | 41 |
Somalia | SOM | SO | 42 |
South Africa | SOU | SA | 43 |
Sudan | SUD | SD | 44 |
Swaziland | SWA | SZ | 45 |
Tanzania | TAN | TZ | 46 |
Togo | TOG | TG | 47 |
Tunisia | TUN | TN | 48 |
Uganda | UGA | UG | 49 |
Western Sahara | WES | WS | 50 |
Zambia | ZAM | ZM | 51 |
Zimbabwe | ZIM | ZW | 52 |
When this border group was created, there appeared to be no consistant set of abbreviations for the African countries. Therefore, the two most commonly found sets of abbreviations are included as the abbr and alt_abbr abbreviation sets. Set rownames to "ab" to reference the primary set and "alt_ab" to reference the second set of abbreviates in the name table.
The rowNames = "alias" and the regionB and dataRegionsOnly features are not supported in the AfricaBG border group.
This dataset contains the population and country data for the 52 countries in the African border group.
data(AfricaPopData)
data(AfricaPopData)
A data frame with 52 observations, 1 for each African country, on the following "x" variables.
an integer rank of the country in Africa.
a character vector containing the Africa Country Name.
a character vector containing the African Country Abbreviation.
a numeric vector of the number of the county's population
a numeric vector of the average relative population growth.
a numeric vector of the average absolute population growth.
a numeric vector of the estimated time to double the population - years.
a numeric vector of the official population.
the date the information was last updated.
a numeric vector representing the percentage the country's population is to the total population of Africa.
This dataset was pulled from wikipedica on the population numbers for African countries.
Linda W. Pickle and Jim Pearson of StatNet Consulting, LLC, Gaithersburg, MD
The micromapST function can be used to create linked micromaps for may different spatial areas by using different border groups. Several border group (bordGrp) examples are contained in the package and include the 51 states and DC of the United States, the counties of Kansas, Maryland, New York, Utah, the countries and provinces of the U.K. and China, and the U. S. Seer Registries used by the National Cancer Institute. Each border group is a different dataset containing the unique boundaries and operational information to allow micromapST to work in a different spatial area. The structure of each border group dataset is identical with the same variable names and types of structures. A user can build their own border group dataset to meet their specific spatial area needs. Because the package contains several border group datasets, the use of lazydata or lazyloading has been disabled.
The name of the border group is specified in the bordGrp call parameter. To permit a user to reference a border group dataset not contained in the package, and reside in a user's folder, the bordDir must be used to direct the package to the border group. The border group must be saved under R using the save function with the file extension of ".rda". For example: bordGrp="private", bordDir="c:/SavedBorderGroups"
Each border group contain six (6) datasets by the same data.frame names. This allows the micromapST package the ability to quickly load a particular border group and create the requested micromaps. The five data.frames are: areaParms, areaNamesAbbrsIDs, areaVisBorders, L2VisBorders, RegVisBorders, and L3VisBorders. Since the same data.frame names are reused in each border group, the R lazyload feature is disabled in the package.
The following describes the purpose and structure of each data.frame in the border group dataset:
- contains specific parameters and operational information for the border group
- containing the names (full text), name abbreviations, wildcard string for name matching, alternate name abbrevations, regional (Level 2) association of the sub-area, and a numerical identifier for the areas in the border group. If the border group is for a state, the areas would be the counties within the state. If the border group is for a continent, then the areas are the countries on the continent. When the border group is for a country like the United States, then the areas are the administrative areas (like states, provinces, special administrative areas, or cities) within that country.
- the boundary point lists for each area in the border group.
- when a border group needs to have an intermediate set of boundaries drawn for clearity, the border group can provide the package a layer 2 set of boundaries via the L2VisBorders data.frame. The structure consists of the boundary information (point lists) of these areas and associated keys. Each area is linked to it's L2 boundary via the L2_ID variable (column) in the name table (areaNamesAbbrsIDs data.frame. At this time only the U. S. States and U. S. Seer Registry border group uses the L2 boundary overlays. It uses L2VisBorders to draw the boundaries of U. S. States around Seer Registries within a state. The areaParms$Map.L2Borders variable in the border group must be set to TRUE for the package to draw the layer 2 boundaries. If a border group does not use an intermediate level outline, L2VisBorders should be set to L3VisBorders or NA and the areaParms$Map.L2Borders variable set to FALSE. In this case, no L2 boundaries are drawn.
- when the border group wants to work with the areas on a regional basis, the regID variable in the name table (areaNamesAbbrsIDs), the RegVisBorders boundary information data.frame, and the areaParms$aP_Regions are used to enable the feature and provide the information to map only regions of areas and overlay areas with regional boundaries highlights. When the regions call parameter is set to TRUE and the selected border group has the areaParms$aP_Regions set to TRUE, the package will scan the data provided by the caller and determine which regions are being referenced and which are not. The package uses the regID variable in the name table(areaNamesAbbrsIDs) to associate a area with a region and as a key into the RegVisBorders data.frame to draw the region boundaries. Any region not containing data and any L2 area and areas within those regions will not be drawn. If a border group does not use the regions feature the RegVisBorders data.frame in the border group should be set to the L3VisBorders data.frame or NA and the areaParms$aP_Regions variablel set to FALSE. This will disable the regions feature for the border group.
- the boundary point list for an outline of the entire geographica area referenced by the border group. For the U.S. or a country border group, this is the outline of the country. For a state border group like Kansas, this is an outline of the state. For smaller areas like Seoul, it is an outline of the city. The L3VisBorders data frame must be present in the border group.
The default border group is USStatesBG to be compatible with older R scripts using previous versions of the micromapST package.
This section provide the data frame structural details of each of the 6 data.frames and thier variables that make up a border group dataset.
The areaParms data.frame contains specific information about and in support of its border group. It provides column headers for the map and id glyphics and controls several features that related to handling a border group by the micromapST package. These controls are specific to a border group and do not changed from one micromapST call to the next and don't have to be part of the calling parameters.
For example, there are several built in titles and labels for the map and id glyphics. These will change for different border groups. For the USStatesBG border group, the map title is always "U.S. States", while in the USSeerBG border group the map glyphic title is "U.S. Seer Areas".
This data.frame contains the following variables that are used by micromapST when handling a border group.
The areaParms data.frame variables are:
- logical variable. If TRUE then the border group represents the entire US area and labels will be placed on the first map for Alaska, Hawaii, and DC. This is option is only used with the USStatesDF and USSeerBG border groups. For the state/county border groups and foreign country border groups, areaUSData must be set to FALSE. This variable is being retired.
- logical variable. If TRUE, enables the use of the "alias" name matching feature and the rowNames=alias call parameter. The "alias" name matching feature permits a partial ("contains") match of the loc id data in the rowNamesCol link column in the statsDFrame or the row.names of the statsDFrame data.frame. At the present time, the "alias" feature is only used in the USSeerBG bordGrp. It was implemented to be able to use directly use data generated by the SEER*Stat website and match the loc ids to the internal SEER*Stat registry names in the border group. This feature can be expanded when needed to other border groups if the "Alias" column in the name table is filled in with unique strings.
- logical variable. If TRUE, the package contains a valid
RegVisBorders data.frame and the name table
(areaNamesAbbrsIDs) contains the regID (or name) of a
region boundary associated with the area. If the caller set the
regions call parameter to TRUE, the package will
only draw areas in regions and the region boundaries if the region
contains areas with data. For examples: this allows you to
provide data for the west coast U. S. states and not have the
midwest, south, or northeastern states drawn. This feature is
available to all border groups, but is currently only used by the
USStatesBG, USSeerBG, UKIrelandBG and ChinaBG border groups. If set
to FALSE, the regions feature is disabled and all regional
information ignored.
The RegVisBorders should still exist, but should be set to
equal the L3VisBorders data.frame. The regions
call parameter will be ignored. As an example: In the USStatesBG
and USSeerBG border groups, the regions are set up using the
4 U. S. census regions of: West, South, Midwest, and NorthEast.
If only data for states in the NorthEast are passed to micromapST,
only the NorthEast region and its states will be mapped when
regions is set to TRUE. If regions is set
to FALSE then all of the boundaries for all of the U. S.
States and DC are drawn eventhough data was only provided for the
states in the northeast region. This feature also allows a border
group with a large number of sub-area, like the UK and Ireland
border group to be assembled and used on a regional basis with
fewer sub-area were the full border group is not really usable
with linked micromap presentations.
- character variable. 1st title for the id type glyphics column.
- character variable. 2nd title for the id type glyphics column.
- character variable. 1st title element for the map type glyphics column. This variable is not implemented in this release.
- character variable. 2nd title element for the map type glyphics column. This variable provides the type of areas in the map. This string should be kept to less than 12-16 characters.
- a numeric variable. The X/Y aspect ratio for the map borders in this border group. This is used to adjust the map glyphic to keep the area's aspect looking correct.
- a numeric variable. The minimum height of a group/row in inches
- the maximum height of a group/row in inches.
- a number representing the cex multiplier used on the text function when the map labels are drawn.
- a character vector of the name of the border group.
- logical variable. If TRUE, the L2VisBorder border overlay is drawn on the map glyphics. If FALSE, the L2VisBorders boundaries are not drawn. This option is currently only used by USSeerBG to draw state boundaries for states containing multiple Seer Registries.
- logical variable. If TRUE, the regions feature is enabled. When the feature is enabled, the RegVisBorders data.frame should be included in the border group, but it is not required. The key to the regional feature is the regID column in the name table (areaNamesAbbrsIDs) that identifies the region associated with the area and the regional boundaries in the RegVisBorders data.frame. If FALSE, the regions feature is disabled.
- a character vector containing the projection used to create the areaVisBorders, L2VisBorders, RegVisBorders, and L3VisBorders boundary point lists.
- a character vector containing the measurement units of the coordinates in the VisBorders boundary point lists.
- logical variable. If TRUE, the L2VisBorder boundary data is available to drawn on the map glyphics. If FALSE, the L2VisBorders boundaries are not available and are not drawn. This option is currently only used by the following border groups: USSeerBG.
- logical variable. If TRUE, the RegVisBorder boundary data is available in the border group and can be used to drawn a regional boundaryy overlay on the map glyphics. If FALSE, the RegVisBorders boundary data is not available and regional boundaries are not drawn. This set of boundaries are only only available in the following border groups: USStatesBG, USSeerBG, ChinaBG, and UKIrelandBG. The drawing of the regional boundaries is controlled by the regionB call parameter.
The areaNamesAbbrsIDs data.frame provides the linkages between the fullname, abbreviation, and numerical identifier to link the statsDFrame data to the boundaries for the county areas. It is also used to validate the incoming data to ensure the linkage can be established. Within the boundary files, the area abbreviation is the key linkage.
The areaNamesAbbrsIDs dataset provides a table to permit the translation of a numerical ID (e.g., FIPS codes), area abbreviatation (should be less than 4-6 characters), optional alternate abbreviation, and full area names in the micromapST function.
- character string of each area name. Used to link the areas to boundaries when rowNames = "full" is specified. Used as the represented name of the sub-area in "ID" glyphics columns when plotNames = "full" is spedified (default).
- the name abbreviation for each area. Should be 2 to 3 character, but no more than 6 characters. Used to link the sub-areas to boundaries when rowNames = "ab" is specified. Used as the represented name of the area in "ID" glyphics columns when plotNames = "ab" is specified (default).
- an alternate name abbreviation for each sub-area. Should be 2 to 3 characters, but no more than 6 characters. Most of the time this field is identical to the Abbr field. In some cases, multiple sets of authenticated abbreviations were found for the sub-areas in a continent or country. When this happened, the most common abbreviation should be placed in the Abbr field and the second abbreviation placed in the Alt_Abr field. This features allows the border group to be used by a wider audience. To access the Alt_Abr abbreviates, set the rowNames call argument/parameter to "alt_ab". The labels in the statsDFrame statistics data.frame will be matched against the alternate abbreviations.
- numerical identifier for each area. Used to link the data to boundary information when rowNames = id is specified. The row.names in the user provided data.frame or specificed rowNamesCol column must match the IDs in the name table. If no match a warning is generated and the data row ignored.
- a character string for each area used to do a wildcard match ("contains") with the rowName or rowNamesCol specified in the micromapST call when the USSeerBG border group is used. When a match is completed, the abbreviation is used to link the user's data to the boundary data. The alias match is done when rowNames is set to "alias". There should be only one match in the data for each sub-area alias. If more are found, an error is raised. This feature is only supported in the USSeerBG border group.
- is the level 2 identifier. Used to link the area to the L2VisBorders boundary data point data.frame. In the case of the USSeerBG border group, the L2 boundaries are state boundaries and the L2_ID value is the state 2 character abbreviation.
- is the full name L2 area.
- is the region identifier. Used to link the area to the RegVisBorders boundary data point data.frame. The USStatesBG and USSeerBG border groups use this field to like the sub-areas to the four (4) U. S. Census regions (Northeast, South, MidWest, and West) This association is used with the regions call parameter to determine which regions and their areas, etc. will be drawn when the caller provide data is mapped.
- is the full name of a region.
- is a character string used to link the name table to the boundary data in the VisBorders data.frames.
- when the initial border group is created, the Key, Name or the Abbr variables may not be able to provide a link to the original boundary data. When this happens, the border group creater must use an alternate "link" to tie the name table to the boundary data. The link field is used to accomplish this in the package provided border group building steps and functions. Once the border group is fully constructed, the Link field is not use.
These three columns have replaced the MapLabel column outlined below. The MapL column provides the label to be drawn at the coordinates provided for the area. Only a few areas should require labels. To many labels will make the map unreadable in a linked micromap. The MapX and MapY columns in the name table provide the x and y coordinates for the label on the map. The coordinates are in the same units as the boundary coordinates used.
There are four boundary dataset - The boundaries of the areas being micromapped (counties) are in areaVisBorders. The boundaries of an intermediate level (2) for general orientation are in L2VisBorders. The boundaries of regional areas for highlight overlays and regional only mapping and are in RegVisBorders. The boundaries of the an outline of the entire mapping area is in L3VisBorders.
These four data.frames contain the boundary point lists for the areas, regions and total map space. The L2VisBorders, RegVisBorders, and L3VisBorders data.frames are used to outline groups of areas, regions of the area and the entire space. The areaVisBorders data.frame contains the point lists for each sub-areas and are keyed to the name table (areaNamesAbbrsIDs).
The data structure for each of the following four boundary data.frames is:
seq Key x y hole
- a numerical sequence number of the boundary points in the data.frame.
- the Key field had different uses in each of the four VisBorders structured data.frames. In the areaVisBorders data.frame, the Key is the unique key for the sub-area as defined in the name table (areaNamesAbbrsIDs). All of the points with the same Key form one or more polygons and represent boundaries of a sub-area. This allows the package to locate the boundary data for a specific sub-area when needed.
In the RegVisBorders boundary data.frame, the Key is the region ID associated with the boundary points (polygons). One or more sub-areas can be linked to an region boundary via the regID variable in the name table. The USStatesBG, USSeerBG, UKIrelandBG,and ChinaBG border groups may use of the regions feature.
In the L3VisBorders, all of the Key field values area set to a name that represents the entire border group area. The Key field is not used in drawning the area's outline. The level 3 boundary outline data.frame is always drawn when the entire geographic area is mapped. If not all of the regions are being mapped, then the L3 boundary is not drawn.
- a numerical value for the x coordinates of a polygon. The end of the polygon is signaled by an x value of NA. The first point in the polygon does not have to be repeated. The polygon draw function used by micromapST will close the figure. An area may be composed of several polygons. Holes in areas are also represented by polygons and are associated with the sub-area.
- a numerical value for the y coordinates of a polygon. The end of the polygon is signaled by an y value of NA.
- a logical value. If TRUE, the associated polygon is a hole within the sub-area identified by the Key field. A hole polygon is always drawn using the maps background color. For this reason, sub-areas containing holes (lakes, rivers, or other sub-areas), must proceed any sub-area in the data.frame that it may overlay with the hole. For example, if one sub-area "A" is contained within sub- area "B", sub-area "B" must have a hole where sub-area "A" is located and must preceed sub-area "A" in the VisBorders data.frame. In this way, sub-area "B" and its hole are drawn before sub-area "A" preventing sub- area "B" hole from overlaying sub-area "A".
Hole are processed by re-drawing the hole area with the current background color. Therefore, any area with holes must be in the data.frame prior to any areas that may occupy the hole's space in the map.
The order of the area's boundaries in these files is very important to allow correct processing of the holes and any areas that overlay holes. Holes are not unpainted polygons, but polygons repainted back to the background color. The order should be areas with holes must preceed areas that overlay hole space. This is required to ensure an area is not over-painted by an area's hole
Each data.frame should be validated before using to make sure they are clean and will not generate errors when used by micromapST.
Each border group contains the same six data.frames using the same six names. This allows the micromapST package the ability to quickly load a particular border group and create the requested micromaps.
See the write up on each included border group for details on the specific content of their border group dataset and the list of sub-area names, abbreviations, and id that should be used to link the data to the boundary information.
The regions feature and RegVisBorder overlays are only supported in the following border groups:
USStatesBG USSeerBG UKIrelendBG ChinaBG
Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
The package's micromapST function created linked micromaps for
any geographic region. The information related to the selected
regions is provided to the micromapST function via a bordergrp
dataset. This dataset contains all of the operational information, the
needed boundary dataset and a name table required for micromapST to
draw the linked micromaps for the desired region of the world or elsewhere.
The dataset must contain the general information data.frame (areaParms), the
area boundary data.frame (list of points for each area in the boundery) and
a name table that provide the the "Name", "Abbr", and "ID" location names for
each area in the boundary group. The Name Table also assists in providing
the L2 and Regional features. The border groups were originally created
manually, one by one.
The BuildBorderGroup function is a compilation
of the learning from building past border groups. The function
tries to provide a common foundation to address many of the unique
situations discovered building the border groups manually. However, there
are still some situations that required the builder's intervention in
order to produce a usable map for linked micromaps.
The BuildBorderGroup function accepts a shape file (ESRI format)
and a user built name table. The name table provides the location ids
(name, abbreviation or id) to allow the micromapST user ways to specify
the identity of individual area using one of several forms of identifier.
The forms used by the BuildBorderGroup and micromapST functions
are "Name", "Abbr" (abbreviation), and the numerical "ID" identifiers
that have been assigned and accepted over time for the area as the
primary location identifiers.
In two special cases micromapST has been extended to support an
"Alt_Abbr" and an "Alias" form of the location identifiers. The "Alt_abbr"
form was added to handle geographic area that have two sets of commonly
used abbreviations. The name table was expanded to contain both sets
of abbreviations and allows the user of the border group to select the
abbreviation that matches data frame location ids. The "Alias" identifier
was implement to handle a special case where the source of the statistical
data did not use the accepted "Name" or "Abbr" location ids, but used
a more generallized string. To keep the matching as flexiable
as possible, a wildcard matching alias was introduced. As each
identifier in the data is examined, it is compared against the alias
string with leading and traiing "*" matching character. As long as
the Alias strings are unique and do not appear in two data rows,
the matching provide the link between the data and the geographical
area. Over the past 10 years, the source program has been changed many
times, but the location id label matches has continued to work.
The user created name table also provides additional information
regional identification and sub-setting of the map, and area labeling.
During the border group building process, the name table
The Name Table also provides parameters to the BuildBorderGroup
function to make specific modifications to areas. The modifications to
areas include shifting it location, scaling it size, and rotating
the area. These modifiers were used to scale Alaska to 35
it's normal size and move it to below California to reduce the size
of the total map. Hawaii was also moved below the U. S. to reduce
the total size of the map.
The name table also allows the builder to assocate areas with regions
in their map and Level 2 border outlining. Check the micromapST
documentation on how the Level 2 and Region boundaries are to be used.
The most important task the name table does is help provide the linkage
between the rows in the user's data frames to the an area's collection
of polygons from the shapefile. The Name Table also provide multiple
location ids to allow the user to provide the full formal name ("Name"),
one of two abbreviates (if available), or the numerical ID
assigned to each area.
When the processing is completed, the user has a border group .RDA file
ready for use with micromapST with boundaries usable in the
small maps in linked micromaps and the name table documentation for
the border group.
The BuildBorderGroup function validates all of the call parameters, inspects the information provided by the builder in the name table, inspects the shape file provided, inspects the projections used or requested, ensures the +unit= parameter in the projection is set to meters, The function performs the following steps to construct a border group: may different spatial areas by using different border groups.
In Version 3.0.0 of the BuildBorderGroup function, the function was upgraded to retire and package functions from the rgdal, rgeos and maptools packages as of October 16th, 2023. The spatial functions are not done by the "sf" package.
The following describes the process of conconstructing a border group dataset:
1. Validate all of the calling parameters provided on the function call and provide targeted error and warning messages.
2. Read the shape file to gather initial shape file variables. The shape file can be read prior to the call to the BuildBorderGroup function, modified, and passed as a structure using the ShapeFile call parameter.
3. Read the name table file and verifies all required columns are present. The Name Table data can also be pre-read and passed to BuildBorderGroup function as a data.frame .
4. Validate the data in the name table columns: characters vs. numeric vs. NA; duplicate valids, and range of values.
5. Validate the geometry in the shape file data.
6. Simplify the shape file geometry to provide a caricatured map with minimal vectex, but maintaining the ability to recognize the areas. This generally reduces the size of the boundary information to between 1 using the rmapshaper package.
7. Preform a union the polygons in the shape file to organize polygons by their associated area. In SP, it would be collecting all polygon data for an area as a list of polygons as one group. In sf, it is collecting all of the polygons for an area and forming a multipolygon of the data. Since we have moved entirely to sf operation, this union is preformed by the aggregate.sf features in sf.
8. Match the name table rows (entries) with the area polygons in the spatial structure. Once the match is established, a key is assigned for the area and set in both the name table and spatial structure. The abbreviation id is normally used for the key, but if the abbreviation is not provided, a short key is created. Abbreviation for areas are not always available.
9. The projection of the map makes a difference in how usable the map in making linked micromap. It was desided to no use longitute/latitude in the final resulting map. Therefore, the user has three choices: specify a non-long/lat projection on the original shape file; specify final projection parameters in the "proj4" call parameter, or let the function construct a Albers Equal Area projection about the centroid of the map with the north and south latitudes half the distance from the middle to the north and south limits of the map. If there are any map labels requested, their location points are transformed. Any needed transformation is done at this time.
10. The areaParms data.frame table is constructed to permit restarting the build process and to save all of the other call parameters and operation parameters needed by micromapST. The MapHdrs, IDHdrs, MapMinH, MapMaxH, and LabelCex variables are saved in the areaParms data.frame.
11. The name table, areaParms, and the 4 boundary data.frame structures are check pointed to disk for possible manual editing. If manual editing of the shape file or name table is done, they result of the edits must be saved using the original file name and the file placed in the original check point directory. The check point files will be read when the BuildBorderGroup is call with the checkPointRestart parameter set to TRUE the name table directory provided is the same, and the check pointed files are located. Manual editing should be done very carefully.
12. On a "checkPointReStart=TRUE" the BuildBorderGroup function reloads all of the check pointed data.frame to pick up the Border Group building process from where it left off.
13. On either a restart or a normal run, the function gathers the boundary information for the area boundaries, layer 2 boundaries, regional boundaries, and a map outline boundary (Layer 3).
14. The name table becomes the areaNamesAbbrsIDs data.frame and any working columns not needed in the final name table are deleted.
15. The name table to aggregate the area boundaries to form the regional boundary spatial structure, the Level 2 spatial structure, and the Lever 3 spatial structure (outline of the entire set of areas.) This is done to make sure all of the layers of boundaries correctly overlay each other.
16. Each set of spatial images are converted into the micomapST boundary data.frame structures (not sp or sf) that are compatiable with the R polygon drawing function.
17. The final collection data.frames for the border group are: areaParms, areaNamesAbbrsID, areaVisBorders, L2VisBorders, RegVisBorders, and L3VisBorders data.frames are written out to a single .RDA dataset file as the completed border group dataset.
Each border group is a different dataset containing the unique boundaries and operational information to allow micromapST to work in a different spatial area. The structure of each border group dataset is identical with the same variable names and types of structures. A user can build their own border group dataset to meet their specific spatial area needs. Because the package contains several border group datasets each one using the same data.frame names and structure, the use of lazydata or lazyloading had to be disabled. This means the R system cannot preload the datasets and have them waiting for use, they would effectively overlay each other.
The name of the border group is specified in the bordGrp call parameter.
To permit a user to reference a border group dataset not contained
in the package, and reside in a user's folder, the bordDir must be
used to direct the package to the border group. The border group must be
saved under R using the save function with the file extension of
".rda".
For example: bordGrp="private", bordDir="c:/SavedBorderGroups"
Each border group contain six (6) datasets by the same data.frame names. This allows the micromapST package the ability to quickly load a particular border group and create the requested micromaps. The six data.frames are: areaParms, areaNamesAbbrsIDs, areaVisBorders, L2VisBorders, RegVisBorders, and L3VisBorders. Since the same data.frame names are reused in each border group, the R lazyload feature is disabled in the package.
Several border group bordGrp examples are contained in the package and include the 51 states and DC of the United States, the counties of Kansas, Maryland, New York, Utah, the countries and provinces of the U.K. and China, and the U. S. Seer Registries used by the National Cancer Institute.
The example in this section shows how to build the Kentucky County border group using a simple name table and the U. S. Census Bureau Kentucky county 2000 boundary data.
BuildBorderGroup(ShapeFile = NULL, # required ShapeFileDir = NULL, # defaults to NameTableDir # required if not the default value of "link" ShapeLinkName = NULL, NameTableFile = NULL, # required NameTableDir = NULL, # required # required if not the default value of "link" NameTableLink = NULL, BorderGroupName = NULL, # required # defaults to NameTableDir BorderGroupDir = NULL, MapHdr = NULL, # optional MapMinH = NULL, # optional MapMaxH = NULL, # optional # required, default is the BorderGroupName and "Areas" IDHdr = NULL, LabelCex = NULL, # optional # optional, but highly recommended ReducePC = 1.25, # percent value proj4 = NULL, # optional checkPointReStart = NULL, # optional debug = 0 # debug only )
BuildBorderGroup(ShapeFile = NULL, # required ShapeFileDir = NULL, # defaults to NameTableDir # required if not the default value of "link" ShapeLinkName = NULL, NameTableFile = NULL, # required NameTableDir = NULL, # required # required if not the default value of "link" NameTableLink = NULL, BorderGroupName = NULL, # required # defaults to NameTableDir BorderGroupDir = NULL, MapHdr = NULL, # optional MapMinH = NULL, # optional MapMaxH = NULL, # optional # required, default is the BorderGroupName and "Areas" IDHdr = NULL, LabelCex = NULL, # optional # optional, but highly recommended ReducePC = 1.25, # percent value proj4 = NULL, # optional checkPointReStart = NULL, # optional debug = 0 # debug only )
ShapeFile |
a character string of the name of the ERSI formated shape file. Only the main part of the shape file name should be provided. The .shp, .shx, .dbf extensions should be omitted. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ShapeFileDir |
a character string defining the path to the folder containing the ERSI shape file. If the ShapeFileDir parameter is not provided or empty, the Name Table Directory will be used. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ShapeLinkName |
a character string defining the name of the variable within the shape file @data slot to use to match the polygons in the shape file with the area's row in the Name Table. The default value is "__Link". |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
NameTableDir |
a character string defining the path to the folder containing the Name Table File. There is no default value for this parameter. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
NameTableFile |
a character string defining the name of the excel spreadsheet file or a .csv file containing the user built Name Table columns and information. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
NameTableLink |
a character string defining the Name Table column name that should be used to match the ShapeFileLink variable to link the Name Table to the area's collection of polygons. The default value is "link". |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
BorderGroupName |
a character string to use as the border group's name and dataset name. If the string ends with a "BG", it will be striped and re-added when the border group is built. If the string does not end with "BG", "BG" will be added to designate the file is a border group. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
BorderGroupDir |
a character string defining the path to the folder where the border group dataset will be written at the end of the processing. If this parameter is not provided or "NA", the NameTableDir parameter will be used as the BorderGroupDir parameter. If the BorderGroupDir path does not exist, the BuildBorderGroup function will create the directory. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MapHdr |
is a two element character vector used to modify the
pre-defined map header labels in micromapST for map type glyphs.
The value is entered as |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MapMinH |
is a numerical variable specifying the minimum height the maps should be in the group/rows in a linked micromap graphic. The default value is 0.5. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MapMaxH |
is a numerical variable specifying the maximum height the maps should be in the group/rows in a linked micromap graphic. The default value is 1.75. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
IDHdr |
is a two element character vector used to modify the id
glyph headers in micromapST. The two values are entered as
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
LabelCex |
a numerical value indicating the cex multiplier to use when drawing the Map Labels (MapL) on the first map in a linked micromap graphic. The default value is 0.4 to match the micromapST maps. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ReducePC |
a numerical value between .01 and 100 tell the package to not reduce the number vertex in the shapefile. This is change from earlier releases where 0 to 1 could be used to represent 0 to 100 With the use of smaller keep values, the scale of the parameter can't be determined. Reduced percentage below 1 are common. Therefore, 0.65 is not 65 that be remaining in the shape file after simplification by rmapshaper. The default value is 1.25 to even 0.65 BuildBorderGroup to determine how this factor attects your boundary data. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
proj4 |
is a character string representing a projection using the Proj4 notation. The transformation to this projection is done as the last step in the processing of the shape file before converting the boundary data into the micromapST boundary data.frame. proj4 is provided, the projection in the original shape file is used. If the projection in the original shape file is missing or Long/Lat, then the function will create an Albers Equal Area protection centered on the centroid of the map's area. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
checkPointReStart |
The BuildBorderGroup function allows for the builder to manually adjust the shape file, just before building the micromapST boundary data.frame. During the normal process, the function writes check point images of the shape file, areaParms data.frame, and the Name Table data.frame to the "CheckPoint" folder in the NameTableDir folder. The builder can inspect and modify any of the tables and the shape file, but must be very careful with what and how they are modified. When done, the builder can re-issue the BuildBorderGroup function call with the checkPointReStart parameter set to TRUE. The function will bypass all of the processing up to the check point, then read in the check point files and continue building the border group. This will frequently be required when the map region contains many small area that will not be seen when shaped and simple scaling or shifting does not produce the desired arrangement of the areas. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
debug |
is a numerical value from 0 to 65536. It is used by the developers to turn on specific actions or information displays to aid in the debugging and troubleshooting this function. The developers assigned a single actions to each bit in a 16 bit integer. In this way, multiple actions can be requested without any posibility of them interferring with each other. For example: the value of 512 (b'00000010 00000000') is assigned to plotting the border group map at each of the 4 stages of processing the shapefile. The four stages are: 1) RAW, just read; 2) after being process by rmapshaper; 3) after the polygons are manipulated as specified in the Name Table; and finally 4) the boundaries after they are converted into the point table format (VisBorders) for use by micromapST. The output plots are saved as PDF files. All that is needed is to set debug=512. Another action the developers created is to save the plots created by the 512 action in PNG type files, not PDF. This can be done by setting debug = 512+128 = 640, in binary: b'00000010 00000000' OR b'00000000 10000000' = b'00000010 10000000'. The integer values are much easier to work with then the long 0s and 1s binary represention of the bits. Default is 0. The full table of assigned values is:
Using any of these debug options will greatly increase the size of the output generated by BuildBorderGroup and should not be used unless requested by the package developer. |
The output of this function is a single R dataset containing 6 data.frames and a text report for inclusion in documentation for the border group. The details on each of the 6 data.frames and their contents can be found in the bordGrp section.
The default border group is USStatesBG to be compatible with older R scripts using previous versions of the micromapST package.
Each of the border groups contained in this package have detail descriptions of the specific geographic area they represent. See the "xxxxx"BG section for each border group:
USStatesBG USSeerBG KansasBG MarylandBG NewYorkBG UtahBG UKIrelendBG ChinaBG SeoulSKoreaBG AfricaBG
Path to the saved Border Group file.
Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
# Load libraries needed. stt1 <- Sys.time() library(stringr) library(readxl) library(sf) # Generate a Kentucky County Border Group # # Read the county boundary files. (Set up system directories. # Replace with your directories to run.) TempD<-"c:/projects/statnet/" # my private test PDF directory exist, #don't use temp. # get a temp directory for the output PDF files for the example. if (!dir.exists(TempD)) { TempD <- paste0(tempdir(),"/") DataD <- paste0(system.file("extdata",package="micromapST"),"/") } else { DataD <- "c:/projects/statnet/r code/micromapST-3.0.2/inst/extdata/" } cat("Temporary Directory:",TempD,"\n") # get working data directory #cat("Working Data Directory:",DataD,"\n") KYCoBG <- "KYCountyBG" # Border Group name KYCoCen <- "KY_County" # shape file name(s) KYCoShp <- st_read(DataD,KYCoCen) st_crs(KYCoShp) <- st_crs("+proj=lonlat +datum=NAD83 +ellipse=WGS84 +no_defs") # inspect name table KYNTname <- paste0(DataD,"/",KYCoCen,"_NameTable.xlsx") #cat("KYNTname:",KYNTname,"\n") KYCoNT <- as.data.frame(read_xlsx(KYNTname)) #head(KYCoNT) spt1 <- Sys.time() cat("Time to get data and boundaries for Counties:",spt1-stt1,"\n") ## Not run: # # building border group for all counties in Kentucky # stt2 <- Sys.time() # Build Border Group BuildBorderGroup(ShapeFile = KYCoShp, ShapeLinkName = "NAME", NameTableLink = "Name", NameTableDir = DataD, NameTableFile = paste0(KYCoCen,"_NameTable.xlsx"), BorderGroupName = KYCoBG, BorderGroupDir = TempD, MapHdr = c("","KY Counties"), IDHdr = c("KY Co."), ReducePC = 0.9 ) # Setup MicromapST graphic spt2 <- Sys.time() cat("Time to build KY Co BG:",spt2-stt2,"\n") stt3 <- spt2 KYCoData <- as.data.frame(read_xlsx(paste0(DataD,"/", "KY_County_Population_1900-2020.xlsx"))) #head(KYCoData) KY_Co_PD <- data.frame(stringsAsFactors=FALSE, type=c("map","id","dot","dot"), lab1=c(NA,NA,"2010 Pop","2020 Pop"), col1=c(NA,NA,"2010","2020") ) KYCoTitle <- c("Ez23ax-Kentucky County","Pop 2010 and 2020") OutCoPDF <- paste0(TempD,"Ez23ax-KY Co 2010-2020 Pop.pdf") grDevices::pdf(OutCoPDF,width=10,height=13) # on 11 x 14 paper. micromapST(KYCoData,KY_Co_PD,sortVar=c("2020"), ascend=FALSE, rowNames="full", rowNamesCol = c("Name"), bordDir = TempD, bordGrp = KYCoBG, title = KYCoTitle ) x <- dev.off() spt3 <- Sys.time() cat("Time to micromapST KY Co graph:",spt3-stt3,"\n") ## End(Not run) # end of dontrun. stt4 <- Sys.time() # Aggregate Kentucky Counties into ADD areas # # The regions in the Kentucky County Name Table (KYCoNT) are the ADD districts # the county was assigned to. # The KYCoShp has the county boundaries. # KYCoShp$NAME <- str_to_upper(KYCoShp$NAME) KYCoNT$NameCap <- str_to_upper(KYCoNT$Name) aggInx <- match(KYCoShp$NAME,KYCoNT$NameCap) #print(aggInx) xm <- is.na(aggInx) # which polygons did not match the name table? if (any(xm)) { cat("ERROR: One or more polygons/counties in the shape file did not match\n", "the entries in the KY County name table. They are:\n") LLMiss <- KYCoNT[xm,"Name"] print(LLMiss) stop() } # ##### # aggFUN - a function to inspect the data.frame columns and determine # an appropriate aggregation method - copy or sum. # aggFUN <- function(z) { ifelse (is.character(z[1]), z[1], sum(as.numeric(z))) } # ##### # aggList <- KYCoNT$regID[aggInx] #print(aggList) KYADDShp <- aggregate(KYCoShp, by=list(aggList), FUN = aggFUN) names(KYADDShp)[1] <- "regID" # change first column name to "regNames" row.names(KYADDShp) <- KYADDShp$regID KeepAttr <- c("regID","AREA","PERIMETER","STATE","geometry") KYADDShp <- KYADDShp[,KeepAttr] st_geometry(KYADDShp) <- st_cast(st_geometry(KYADDShp),"MULTIPOLYGON") #plot(st_geometry(KYADDShp)) spt4 <- Sys.time() cat("Time to aggregate KY ADDs from Cos:",spt4-stt4,"\n") stt5 <- spt4 # Build Border Group BuildBorderGroup(ShapeFile = KYADDShp, # sf structure of shapefile of combined counties into AD Districts ShapeLinkName = "regID", NameTableFile = "KY_ADD_NameTable.xlsx", NameTableDir = DataD, NameTableLink = "Index", BorderGroupName = "KYADDBG", BorderGroupDir = TempD, MapHdr = c("","KY ADDs"), IDHdr = c("KY ADDs"), ReducePC = 0.9 ) spt5 <- Sys.time() cat("Time to build ADD BG:",spt5-stt5,"\n") stt6 <- spt5 # Test micromapST KYADDData <- as.data.frame(readxl::read_xlsx( paste0(DataD,"KY_ADD_Population-2020.xlsx")), stringsAsFactors=FALSE) # KY_ADD_PD <- data.frame(stringsAsFactors=FALSE, type=c("map","id","dot","dot"), lab1=c(NA,NA,"Pop","Proj. Pop"), lab2=c(NA,NA,"2020","2030"), col1=c(NA,NA,"DecC2020","Proj2030") ) # KyTitle <- c("Ez23cx-KY Area Development Dist.", "Pop 2020 and proj Pop 2023") OutPDF2 <- paste0(TempD,"Ez23cx-KY ADD Pop.pdf") grDevices::pdf(OutPDF2,width=10,height=7.5) micromapST(KYADDData,KY_ADD_PD,sortVar="DecC2020",ascend=FALSE, rowNames= "full", rowNamesCol = "ADD_Name", bordDir = TempD, bordGrp = "KYADDBG", title = KyTitle ) x <- grDevices::dev.off() spt6 <- Sys.time() cat("Time to do micromapST of KY ADDs:",spt6-stt6,"\n")
# Load libraries needed. stt1 <- Sys.time() library(stringr) library(readxl) library(sf) # Generate a Kentucky County Border Group # # Read the county boundary files. (Set up system directories. # Replace with your directories to run.) TempD<-"c:/projects/statnet/" # my private test PDF directory exist, #don't use temp. # get a temp directory for the output PDF files for the example. if (!dir.exists(TempD)) { TempD <- paste0(tempdir(),"/") DataD <- paste0(system.file("extdata",package="micromapST"),"/") } else { DataD <- "c:/projects/statnet/r code/micromapST-3.0.2/inst/extdata/" } cat("Temporary Directory:",TempD,"\n") # get working data directory #cat("Working Data Directory:",DataD,"\n") KYCoBG <- "KYCountyBG" # Border Group name KYCoCen <- "KY_County" # shape file name(s) KYCoShp <- st_read(DataD,KYCoCen) st_crs(KYCoShp) <- st_crs("+proj=lonlat +datum=NAD83 +ellipse=WGS84 +no_defs") # inspect name table KYNTname <- paste0(DataD,"/",KYCoCen,"_NameTable.xlsx") #cat("KYNTname:",KYNTname,"\n") KYCoNT <- as.data.frame(read_xlsx(KYNTname)) #head(KYCoNT) spt1 <- Sys.time() cat("Time to get data and boundaries for Counties:",spt1-stt1,"\n") ## Not run: # # building border group for all counties in Kentucky # stt2 <- Sys.time() # Build Border Group BuildBorderGroup(ShapeFile = KYCoShp, ShapeLinkName = "NAME", NameTableLink = "Name", NameTableDir = DataD, NameTableFile = paste0(KYCoCen,"_NameTable.xlsx"), BorderGroupName = KYCoBG, BorderGroupDir = TempD, MapHdr = c("","KY Counties"), IDHdr = c("KY Co."), ReducePC = 0.9 ) # Setup MicromapST graphic spt2 <- Sys.time() cat("Time to build KY Co BG:",spt2-stt2,"\n") stt3 <- spt2 KYCoData <- as.data.frame(read_xlsx(paste0(DataD,"/", "KY_County_Population_1900-2020.xlsx"))) #head(KYCoData) KY_Co_PD <- data.frame(stringsAsFactors=FALSE, type=c("map","id","dot","dot"), lab1=c(NA,NA,"2010 Pop","2020 Pop"), col1=c(NA,NA,"2010","2020") ) KYCoTitle <- c("Ez23ax-Kentucky County","Pop 2010 and 2020") OutCoPDF <- paste0(TempD,"Ez23ax-KY Co 2010-2020 Pop.pdf") grDevices::pdf(OutCoPDF,width=10,height=13) # on 11 x 14 paper. micromapST(KYCoData,KY_Co_PD,sortVar=c("2020"), ascend=FALSE, rowNames="full", rowNamesCol = c("Name"), bordDir = TempD, bordGrp = KYCoBG, title = KYCoTitle ) x <- dev.off() spt3 <- Sys.time() cat("Time to micromapST KY Co graph:",spt3-stt3,"\n") ## End(Not run) # end of dontrun. stt4 <- Sys.time() # Aggregate Kentucky Counties into ADD areas # # The regions in the Kentucky County Name Table (KYCoNT) are the ADD districts # the county was assigned to. # The KYCoShp has the county boundaries. # KYCoShp$NAME <- str_to_upper(KYCoShp$NAME) KYCoNT$NameCap <- str_to_upper(KYCoNT$Name) aggInx <- match(KYCoShp$NAME,KYCoNT$NameCap) #print(aggInx) xm <- is.na(aggInx) # which polygons did not match the name table? if (any(xm)) { cat("ERROR: One or more polygons/counties in the shape file did not match\n", "the entries in the KY County name table. They are:\n") LLMiss <- KYCoNT[xm,"Name"] print(LLMiss) stop() } # ##### # aggFUN - a function to inspect the data.frame columns and determine # an appropriate aggregation method - copy or sum. # aggFUN <- function(z) { ifelse (is.character(z[1]), z[1], sum(as.numeric(z))) } # ##### # aggList <- KYCoNT$regID[aggInx] #print(aggList) KYADDShp <- aggregate(KYCoShp, by=list(aggList), FUN = aggFUN) names(KYADDShp)[1] <- "regID" # change first column name to "regNames" row.names(KYADDShp) <- KYADDShp$regID KeepAttr <- c("regID","AREA","PERIMETER","STATE","geometry") KYADDShp <- KYADDShp[,KeepAttr] st_geometry(KYADDShp) <- st_cast(st_geometry(KYADDShp),"MULTIPOLYGON") #plot(st_geometry(KYADDShp)) spt4 <- Sys.time() cat("Time to aggregate KY ADDs from Cos:",spt4-stt4,"\n") stt5 <- spt4 # Build Border Group BuildBorderGroup(ShapeFile = KYADDShp, # sf structure of shapefile of combined counties into AD Districts ShapeLinkName = "regID", NameTableFile = "KY_ADD_NameTable.xlsx", NameTableDir = DataD, NameTableLink = "Index", BorderGroupName = "KYADDBG", BorderGroupDir = TempD, MapHdr = c("","KY ADDs"), IDHdr = c("KY ADDs"), ReducePC = 0.9 ) spt5 <- Sys.time() cat("Time to build ADD BG:",spt5-stt5,"\n") stt6 <- spt5 # Test micromapST KYADDData <- as.data.frame(readxl::read_xlsx( paste0(DataD,"KY_ADD_Population-2020.xlsx")), stringsAsFactors=FALSE) # KY_ADD_PD <- data.frame(stringsAsFactors=FALSE, type=c("map","id","dot","dot"), lab1=c(NA,NA,"Pop","Proj. Pop"), lab2=c(NA,NA,"2020","2030"), col1=c(NA,NA,"DecC2020","Proj2030") ) # KyTitle <- c("Ez23cx-KY Area Development Dist.", "Pop 2020 and proj Pop 2023") OutPDF2 <- paste0(TempD,"Ez23cx-KY ADD Pop.pdf") grDevices::pdf(OutPDF2,width=10,height=7.5) micromapST(KYADDData,KY_ADD_PD,sortVar="DecC2020",ascend=FALSE, rowNames= "full", rowNamesCol = "ADD_Name", bordDir = TempD, bordGrp = "KYADDBG", title = KyTitle ) x <- grDevices::dev.off() spt6 <- Sys.time() cat("Time to do micromapST of KY ADDs:",spt6-stt6,"\n")
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call argument is used to specify the border group dataset for the geographical area. The ChinaBG border group dataset supports creating linked micromaps for the 34 provinces, special administrative regions, metropolitan areas in the China. When the bordGrp call argument is set to ChinaBG, the appropriate name table (sub area names and abbreviations) and the 34 sub-areas (provinces, SAR, cities, etc.) boundary data is loaded in micromapST. The user's data is then linked to the boundary data via the county's name, abbreviated name or ID based on the table below.
data(ChinaBG)
data(ChinaBG)
The ChinaBG border group dataset contains the following data.frames:
- contains specific parameters for the border group
- containing the names, abbreviations, and numerical identifier for the providences and municipalities of China.
- the boundary point lists for each area in China.
- the boundaries for an intermediate level. This level is not used in this border group.
- the boundaries for an regional level for China. This set of boundaries are used in conjunction with the regions call parameter.
- the boundary of the country of China.
For the China border group, there are 34 county sub-areas listed in the areaNamesAbbrsIDs and areaVisBorders datasets. The L2VisBorders dataset is not used and is set to the L3VisBorders dataset as a placeholder. The RegVisBorders dataset represents the 6 regions of China in the ChinaBG border group. The L3VisBorders dataset contains the outline of the country of China.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area using the fullname, abbreviation, alternate abbreviation, and numerical identifier for each country to the <statsDFrame> data based on the setting of the rowNames call argument. A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area (county) in the name table, the sub-area is mapped but not colored.
The following are a list of the names, abbreviations, alternate abbreviations, ids, and the region for each county, province or metro area in the ChinaBG border group.
name | ab | alt_ab | id | region |
Anhui | AH | CN.AH | 34 | Huadong |
Beijing | BJ | CN.BJ | 11 | Huabei |
Chongqing | CQ | CN.CQ | 50 | Xinan |
Fujian | FJ | CN.FJ | 35 | Huadong |
Gansu | GS | CN.GS | 62 | Xibei |
Guangdong | GD | CN.GD | 44 | Zhongnan |
Guangxi | GX | CN.GX | 45 | Zhongnan |
Guizhou | GZ | CN.GZ | 52 | Xinan |
Hainan | HI | CN.HA | 46 | Zhongnan |
Hebei | HE | CN.HB | 13 | Huabei |
Heilongjiang | HL | CN.HL | 23 | Dongbei |
Henan | HA | CN.HE | 41 | Zhongnan |
Hubei | HB | CN.HU | 42 | Zhongnan |
Hunan | HN | CN.HN | 43 | Zhongnan |
Jiangsu | JS | CN.JS | 32 | Huadong |
Jiangxi | JX | CN.JX | 36 | Huadong |
Jilin | JL | CN.JL | 22 | Dongbei |
Liaoning | LN | CN.LN | 21 | Dongbei |
Nei Mongol | NM | CN.NM | 15 | Huadong |
Ningxia Hui | NX | CN.NX | 64 | Xibei |
Qinghai | QH | CN.QH | 63 | Xibei |
Shaanxi | SN | CN.SA | 61 | Xibei |
Shandong | SD | CN.SD | 37 | Huadong |
Shanghai | SH | CN.SH | 31 | Huadong |
Shanxi | SX | CN.SX | 14 | Huabei |
Sichuan | SC | CN.SC | 51 | Xinan |
Tianjin | TJ | CN.TJ | 12 | Huabei |
Xinjiang Uygur | XJ | CN.XJ | 65 | Xibei |
Xizang | XZ | CN.XZ | 54 | Xinan |
Yunnan | YN | CN.YN | 53 | Xinan |
Zhejiang | ZJ | CN.ZJ | 33 | Buadong |
Hong Kong | HK | CN.HK | 81 | Zhongnan |
Macao | MC | CN.MC | 82 | Zhongnan |
Taiwan | TW | CN.TW | 71 | Huadong |
The ChinaBG supports two sets of abbreviations for each county (province or metro area). No consistant source was found when the border group was originally created. Therefore, the two most common sets are included. The first abbreviation set can be referenced by setting the rowNames call argument to "ab". The second (alternate) abbreviation set can be used by setting the rowNames to "alt_ab".
The rowNames = "alias" features are not supported in the ChinaBG border group.
This dataset contains the 2014 population and average income per person in each of the China Areas
data(cnPopData)
data(cnPopData)
A data frame with 18 observations, 1 for each Seer area, on the following 12 variables.
a character vector containing the China area full names.
a numeric vector of the number of the county's population
This dataset was pulled from the China ??? government website on Dec. 2014. It contains the population and average income per person for each of Kansas' 105 counties.
An internal table containing a list of the details variables and parameters to do validation of user provided variable overrides of the defaults. The table also contains the information needed to translate the existing (as of 9/17/2015) details structure into a new details structure segmented by each type of glyphic.
data(detailsVariables)
data(detailsVariables)
A data frame with 6 variables:
a character string of the exact details variable name
a character string describing the type of test required to validate the variable. The supported tests are: colors, integer, numeric, logical, and vector3
first variable for the "test" for this variable.
second variable (if needed) for the "test" for this variable.
a vector of character string listing
the glyphics that use this variable.
For Example:
c('ts','tsconf')
the new variable name to be used within a glyphic. The glyphic name and variable name must be unique. This eliminates having to include the glyphic name in the variable name.
defines the range of the dependency for the dependent relationship.
indicates this variable is dependent on another variable and it's name.
the default values or the variable.
operational comments on the variable.
This dataset provide a table to micromapST for verification
of the details variables provided by a user to override the
packages default values. The varName is the exact name of
the variable. If the variable name provide by the user does not
match this list, it is flagged as an error and ignored.
The test supported are: colors, integer, numeric, logical and
vector3. The colors test calls the is.Color
function. The
integer and numeric tests use the range provided in v1 and
v2 columns to check the range of the value for the
variable. The logical test verifies the value is TRUE or FALSE.
The vector3 test check to make sure the value is a vector or
length 3 and each value is within the range in v1 to
v2.
The micromapST package is being updated to use a new variable structure. The new structure will allow options/parameters to be specified on glyphic and column basis. This allows each column to be uniquely tuned to the user requirements. The variable names have also been simplifed to use the same name across glyphics when the purpose is the same. This should reduce the users learning curve.
Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
Math Proficiency Survey Results for 8th Graders in 2011 by State
data(Educ8thData)
data(Educ8thData)
A data frame with 51 observations (one per state) on the following 7 variables.
StAbbrev
a character string representing the 2 character state Id for this row.
State
a character string factor of the state full name.
avgscore
a numeric vector of average proficiency score for each state
PctBelowBasic
a numeric vector of percentage of students with below basic scores
PctAtBasic
a numeric vector of percentage of students at basic proficiency.
PctProficient
a numeric vector of percentage of students at the proficient level
PctAdvanced
a numeric vector of percentage of students scoring at the advanced level.
The dataset contains 51 records, one for each state/area. The data represents the percentage of 8th grade students in 2011 in that state who tested at each proficiency level in math: less than basic, basic, proficient and advanced. The row name is the state abbreviation - 2 characters. This dataset is used by the micromapSEER examples using the "USStatesDF" border group.
The National Center for Education Statistics, Department of Education
http://nces.ed.gov/nationsreportcard/states/
The micromapST function has the ability to generate linked micromaps for
any geographical area. To specify the geographical area, the bordGrp
call argument is used to specify the border group dataset for the geographical area.
When micromapST function is used to micromap data for Kansas County area, the
border group option (bordGrp)
is set to "KansasBG". This instructs micromapST to load the area Name, Abbreviation,
ID and boundaries files for Kansas 105 counties.
The datasets contained in the border group are areaNamesAbbrsIDs, areaVisBorders,
L2VisBorders, and
L3VisBorders for the counties of the state Kansas.
The user's data is then linked to the boundary data via the county's name,
abbreviated name or ID based on the table below.
data(KansasBG)
data(KansasBG)
The border group contains the following data.frames::
- contains specific parameters for the border group
- containing the names, abbreviations, and numerical identifier for the counties in the state of Kansas.
- the boundary point lists for each county area in Kansas.
- the boundaries for an intermediate level. For state areas, this boundary data is not used and set to L3VisBorders as a place holder.
- the boundaries for an intermediate level. For state areas, this boundary data is not used and set to L3VisBorders as a place holder.
- the boundary of the state of Kansas
The Kansas county border group contains 105 county sub-areas. Each county has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets. No regions are defined in the Kansas county border group, so the L2VisBorders and RegVisBorders datasets are not used and the regions feations is disabled. The L3VisBorders dataset contains the outline of the state of Kansas.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area (county) using the fullname, abbreviation, and numerical identifier for each country to the <statsDFrame> data based on the setting of the rowNames call parameter.
A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area (county) in the name table, the sub-area (county) is mapped but not colored.
The following are a list of the names, abbreviations, and IDs for each country in the KansasBG border group.
name | ab | id |
Allen | AL | 20001 |
Anderson | AN | 20003 |
Atchison | AT | 20005 |
Barber | BA | 20007 |
Barton | BT | 20009 |
Bourbon | BB | 20011 |
Brown | BR | 20013 |
Butler | BU | 20015 |
Chase | CS | 20017 |
Chautauqua | CQ | 20019 |
Cherokee | CK | 20021 |
Cheyenne | CN | 20023 |
Clark | CA | 20025 |
Clay | CY | 20027 |
Cloud | CD | 20029 |
Coffey | CF | 20031 |
Comanche | CM | 20033 |
Cowley | CL | 20035 |
Crawford | CR | 20037 |
Decatur | DC | 20039 |
Dickinson | DK | 20041 |
Doniphan | DP | 20043 |
Douglas | DG | 20045 |
Edwards | ED | 20047 |
Elk | EK | 20049 |
Ellis | EL | 20051 |
Ellsworth | EW | 20053 |
Finney | FI | 20055 |
Ford | FO | 20057 |
Franklin | FR | 20059 |
Geary | GE | 20061 |
Gove | GO | 20063 |
Graham | GH | 20065 |
Grant | GT | 20067 |
Gray | GY | 20069 |
Greeley | GL | 20071 |
Greenwood | GW | 20073 |
Hamilton | HM | 20075 |
Harper | HP | 20077 |
Harvey | HV | 20079 |
Haskell | HS | 20081 |
Hodgeman | HG | 20083 |
Jackson | JA | 20085 |
Jefferson | JF | 20087 |
Jewell | JW | 20089 |
Johnson | JO | 20091 |
Kearny | KE | 20093 |
Kingman | KM | 20095 |
Kiowa | KW | 20097 |
Labette | LB | 20099 |
Lane | LE | 20101 |
Leavenworth | LV | 20103 |
Lincoln | LC | 20105 |
Linn | LN | 20107 |
Logan | LG | 20109 |
Lyon | LY | 20111 |
Marion | MN | 20115 |
Marshall | MS | 20117 |
McPherson | MP | 20113 |
Meade | ME | 20119 |
Miami | MI | 20121 |
Mitchell | MC | 20123 |
Montgomery | MG | 20125 |
Morris | MR | 20127 |
Morton | MT | 20129 |
Nemaha | NM | 20131 |
Neosho | NO | 20133 |
Ness | NS | 20135 |
Norton | NT | 20137 |
Osage | OS | 20139 |
Osborne | OB | 20141 |
Ottawa | OT | 20143 |
Pawnee | PN | 20145 |
Phillips | PL | 20147 |
Pottawatomie | PT | 20149 |
Pratt | PR | 20151 |
Rawlins | RA | 20153 |
Reno | RN | 20155 |
Republic | RP | 20157 |
Rice | RC | 20159 |
Riley | RL | 20161 |
Rooks | RO | 20163 |
Rush | RH | 20165 |
Russell | RS | 20167 |
Saline | SA | 20169 |
Scott | SC | 20171 |
Sedgwick | SG | 20173 |
Seward | SW | 20175 |
Shawnee | SN | 20177 |
Sheridan | SD | 20179 |
Sherman | SH | 20181 |
Smith | SM | 20183 |
Stafford | SF | 20185 |
Stanton | ST | 20187 |
Stevens | SV | 20189 |
Sumner | SU | 20191 |
Thomas | TH | 20193 |
Trego | TR | 20195 |
Wabaunsee | WB | 20197 |
Wallace | WA | 20199 |
Washington | WS | 20201 |
Wichita | WH | 20203 |
Wilson | WL | 20205 |
Woodson | WO | 20207 |
Wyandotte | WY | 20209 |
There are no alternate abbreviations or regions assocated with counties in Kansas.
The id field value is the U. S. state and county FIPS code.
The rowNames = "alias" or "alt_ab" and the regionB and dataRegionsOnly features are not supported in the KansasBG border group.
This dataset contains the 2014 population and average income per person in each of the 105 Kansas counties.
data(KansPopInc)
data(KansPopInc)
A data frame with 18 observations, 1 for each Seer area, on the following 12 variables.
a character vector containing the Kansas County Name.
a numeric vector of the number of the county's population
a numeric vector of the average person's income for the county.
This dataset was pulled from the Kansas government website on Dec. 2014. It contains the population and average income per person for each of Kansas' 105 counties.
Linda W. Pickle and Jim Pearson of StatNet Consulting, LLC, Gaithersburg, MD
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call argument is used to specify the border group dataset for the geographical area. The MarylandBG border group dataset supports creating linked micromaps for the 24 counties in the state of Maryland. When the bordGrp call argument is set to MarylandBG, the appropriate name table (county names and abbreviations) and the 24 sub-areas (countries) boundary data is loaded in micromapST. The user's data is then linked to the boundary data via the county's name, abbreviated or ID based on the table below.
data(MarylandBG)
data(MarylandBG)
The MarylandBG border group dataset contains the following data.frames:
- contains specific parameters for the border group
- containing the names, abbreviations, and numerical identifier for the counties in the state of Maryland.
- the boundary point lists for each county area in Maryland.
- the boundaries for an intermediate level. For this border group, this boundary data is not used and set to L3VisBorders as a place holder.
- the boundaries for an intermediate level. For this border group, this boundary data is not used and set to L3VisBorders as a place holder.
- the boundary of the state of Maryland
The Maryland county border group contains 24 county sub-areas. Each county has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets. No regions are defined in the Utah county border group, so the L2VisBorders and RegVisBorders data.frames is not used and the dataRegionsOnly call parameter are is disabled. The L3VisBorders dataset contains the outline of the state of Maryland.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area (county) using the fullname, abbreviation, and numerical identifier for each country to the <statsDFrame> data based on the setting of the rowNames call argument.
A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area (county) in the name table, the sub-area is mapped but not colored.
The following are a list of the names, abbreviations, and ids for each country in the MarylandBG border group.
name | ab | id |
Allegany | AL | 24001 |
Anne Arundel | AA | 24003 |
Baltimore | BL | 24005 |
Baltimore City | BC | 24510 |
Calvert | CV | 24009 |
Caroline | CL | 24011 |
Carroll | CR | 24013 |
Cecil | CC | 24015 |
Charles | CH | 24017 |
Dorchester | DR | 24019 |
Frederick | FR | 24021 |
Garrett | GR | 24023 |
Harford | HR | 24025 |
Howard | HW | 24027 |
Kent | KN | 24029 |
Montgomery | MG | 24031 |
Prince George's | PG | 24033 |
Queen Anne's | QA | 24035 |
St. Mary's | SM | 24039 |
Somerset | SS | 24037 |
Talbot | TB | 24041 |
Washington | WH | 24043 |
Wicomico | WC | 24045 |
Worcester | WR | 24047 |
There are no alternate abbreviations or regions assocated with counties in Maryland border group.
The id field value is the U. S. state and county FIPS code.
The rowNames = "alias" or "alt_ab" and the regionB and dataRegionsOnly call parameters are not supported in the MarylandBG border group.
This dataset contains the county populations for 1970, 1980, 1990, 2000, and 2010 and the projected estimated populations for 2015, 2020, 2025, 2030, 2035 and 2040 for each Maryland county and Baltimore City.
data(mdPopData)
data(mdPopData)
A data frame with 24 rows, 1 for each county, with 12 variables for each county:
a character vector abbreviation for the county.
a character vector containing the Maryland county name and Baltimore City.
a numeric vector of the county's population in 1970.
a numeric vector of the county's population in 1980.
a numeric vector of the county's population in 1990.
a numeric vector of the county's population in 2000.
a numeric vector of the county's population in 2010.
a numeric vector of the county's estimated population in 2015.
a numeric vector of the county's estimated population in 2020.
a numeric vector of the county's estimated population in 2025.
a numeric vector of the county's estimated population in 2030.
a numeric vector of the county's estimated population in 2035.
a numeric vector of the county's estimated population in 2000.
.
This dataset was pulled from the Maryland government website in January, 2015. It contains the population and estimated population for each of Maryland counties and Baltimore City for 1970, 1980, 1990, 2000, and 2010 and estimated populations for years 2015, 2020, 2025, 2030, 2035, and 2040.
The BuildBorderGroup function verifies the call parameters and the incoming data. It then produces the data structures required by micromapST to draw maps for custom geographic areas. The user must provide a shape file and a name table for the function to product the user's custom border group dataset. This chapter documents the error and warning messages created by the BuildBorderGroup function when a problem or error is discovered before R has a chance to abort the execution of the function.
Like micromapST messages, BuildBorderGroup messages all start with "***" to help quickly find them in the warnings() logs and general output.
The general format of the messages is:
***XXXX text of message
where the XXXX is the unque four alphanumeric message identifier of the message. This chapter attempts to provide insight into the cause of the message and possible ways to resolve the problem.
Messages with numbers ranging from 3000 to 3999.
Conventions: The user provides the shape file and the name table to the function. The messages attempt to provide specific information on the problem detected. So, parts of the message will be modified to provide more specific information. In the messages below <name> type fields are used to identify the variable information in the messages.
Any values in the message that are replaced by more specific information are shown as <namefield>. The explanation for the message and suggested means of resolution will reference the <namefield> variable to help define what the message is associated with in the operation of the BuildBorderGroup funcition.
The following is a listing of all micromapST generated messages and a description of possible causes and solutions.
BuildBorderGroup Messages:
Debug Call parameter:
debug call parameter is not a numeric value.
The default value of <default debug> will be used.
The value provided for the debug call parameter is not a numeric value. It must a number between 0 and 65535. The debug parameter will be set to the default value of 0.
checkPointRestart
The checkPointReStart call parameter is not a
logical value.
The checkPointReStart call parameter must be a logical variable with a value of TRUE or FALSE. The default value is FALSE.
Check Point Restart has been requested. Check
point files
will be read from folder:
<name table directory>/Checkpoint directory.
The checkPointReStart call parameter was set to TRUE and the BuildBorderGroup function will attempt a restart using the <name table directory> and the Checkpoint subdirectory to locate the check point files. If valid, the files will be reloaded and the BuildBorderGroup process will be continued to completion.
General Call Parameter
Require call parameters are missing :
<list of parameters> - execution stopped.
BuildBorderGroup requires several call parameters to run. The <list of parameters> string presents to operate properly. The minimum set of call parameters is: NameTableFile, BorderGroupName, and ShapeFile. If no directories are specified either the value of the NameTableDir will be used or the current working directory. The function must have enough information to locate the initial name table (.csv or spreadsheet format) and the shape file(s) (.shp, .shx, .dbf)
Name Table Directory
NameTableDir call parameter is missing or NULL. Correct
and re-run.
The NameTableDir parameter must be a character string that provide
the directory path information to the directory containing
the Name Table file.
NameTableDir call parameter is an 'NA', Empty or not
a character string. Value= <NameTableDir> Correct and re-run.
The NameTableDir parameter must contain a character string
with length > 1 of an existing path to the directory that contains
the Name Table File. The value was found to be NA, ""(empty), or a
non-character variable. Correct the incorrect value and retry.
NameTableDir value specified does not exist.
Value= <NameTableDir>
The character string provided as the NameTableDir value is not a
valid path to the directory containing the Name Table File.
The path provided does not exist.
BorderGroup Files
BorderGroup directory specified in the call parameter
does not exist. It will be created. Value= <BorderGroupName>.
The BorderGroupDir parameter in the function call specifies a
non-existing directory.
The path provided will be used to create the BorderGroupDir directory.
The required BorderGroupName call parameter is missing.
The BorderGroupName parameter must be provided to be able to create
the final border group data set. Provide the required parameter and
rerun the function.
BorderGroupName is a 'NA', Empty or is not a character
string. Value= <BorderGroupName>.
The BorderGroupName parameter is not a character string or has a value
of NA. Neither can be used as the border group name. The value must be
a character string that is acceptable as a filename by operating system.
Inspect and correct.
BorderGroup directory specified in the call parameter is NA, Empty or not a character vector. The parameter will be ignored and the NameTable directory used. Value=<BorderGroupDir> The provided Border Group Directory is found to be NA, ""(empty), or a non-character variable. The parameter is ignored and the Name Table Directory is used instead. If the Border Group Directory was meant to point to a different location, correct the parameter and retry.
NameTable Files
NameTableFile parameter has not been provided.
Execution Stopped.
The name table filename for the name table data has not been
provided. Add the appropriate NameTableFile call parameter.
The name table passed to function on NameTableFile
parameter is not a valid variable type: <class of NameTable data>
The class of the NameTableFile parameter must be a data.frame or
a character string. It was found to be <class of NameTable data>.
Correct and rerun.
The NameTableFile parameter is set to 'NA', requires
a valid file name or structure.
The name table file parameter is set to a NA value. It must be a
valid data.frame or a character string indicating the name of the
name table file in the name table directory.
The NameTableFile parameter is empty. A name table
structure or filename must be provided.
For BuildBorderGroup to function, it must have the name or structure
of a valid Name Table. Without this, the border group can not build
a name table or complete the construction of a border group dataset.
Check on how to build a name table or check for typos in the
Name Table File value.
The NameTableLink call parameter is not a character
string. Fix and rerun.
The NameTableLink call parameter must be a character string that can
be used to reference a data.frame column read in from a .csv, spreadsheet
or R .rda and .RData files. Please inspect the link name and make sure
it is a valid column in the initial name table.
The NameTable file is not a .csv, Excel,
or R .RDA format.
The file containing the initial NameTable data must be a .csv,
Excel spreadsheet or R .rda file. Each column must be one of the data
columns required to construct the name table. (see name table section.)
Name Table Link
The NameTableLink call parameter is an 'NA', empty
(\'\' or \' \') or not a character string. Fix and rerun.
The NameTableLink identifies the name table column that will be
used to link the name table to the information and polygons in the
shape file structure. It was found have a value of 'NA',
Empty (\'\' or \' \') or was not a character type value.
The NameTableLink call parameter must be a character string that can be used to reference a data.frame column read in from a .csv, spreadsheet or R .rda file. Please inspect the link name and make sure it is a valid column in the initial name table.
Shape File and Directory
Shape File parameter has not been provided or is NA.
The filename of the shape file has not been provided in the
BuildBorderGroup function call. Review the function call and make sure
the ShapeFile parameter is present and identifies
the base filename of the shape file to be used.
The extensions of .shp, .shx, or .dbf do not have to be specified.
The base filename will be used as the layer to be read.
The ShapeFileDir is used as the 'dsn' value on the st_read call.
ShapeFile is a sf structure passed as a call parameter.
The ShapeFile call parameter is a sf structure.
ShapeFile is a SPDF structure passed as a call parameter.
The ShapeFile call parameter is a SPDF structure.
The ShapeFile call parameter is being used to pass
a full spatial structure to the function. However, the structure
must be a SPDF or a sf class. The data was: <ShapeFile class>
The ShapeFile call parameter must be a character string,
sf structure or SPDF structure. The class of the ShapeFile value
is <ShapeFile class> and is not usable. Fix and retry.
Shape file directory specified in the ShapeFileDir
call parameter does not exist. Value= <ShapeFileDir>
The directory specified in the ShapeFileDir call parameter does not exist.
The value was <ShapeFileDir>. Make sure the character string is a valid
path to the directory containing the Shape File and that access is
permitted. When no value is provided, the directory specified for the
NameTableDir will be used. If no NameTableDir is provided,
the current working directory is used.
Shape file (dir & name) does not exist.
Value= <ShapeFileDir & ShapeFile value>
The path to the shape file (directory and name) does not exist.
The either the directory and/or the shape file filename does not exist,
is not accessible or does not reference valid ERSI Shapefile collection
containing a x.shp file. The extension of .shp is added to the
path to validate the directory and filename for ESRI Shapefiles
collections. Only the base filename is required in the ShapeFile
filename. If present it is removed when the shape file is read.
Check the path <ShapeFileDir & ShapeFile value> and make sure
it is valid and can be accessed.
The ShapeFile call parameter is an NA, empty(” or ' '),
or not a character string. The filename of the Shapefile must be specified.
The provided ShapeFile name has a value of NA, is empty (” or ' '),
or is not a character string. A valid filename must be provide for the
ShapeFile name. Correct and retry.
ShapeLinkName parameter
The ShapeLinkName call parameter is missing. The default
value of NAME will be used.
For the shape file polygons to be used by micromapST, a linkage between
the name table and shape file must be established. The ShapeLinkName
call parameter is used to identify the shape file variable the function
can use to match data in the name table. The ShapeLinkName call
parameter is missing. The function will attempt to use the
default value of NAME to find a variable in the shape file.
If this is not correct, provide the ShapeLinkName parameter
with a value of the correct variable.
ShapeLinkName value is NA. The default of NAME
will be used.
The value in the ShapeLinkName call parameter is an NA value.
It must be a character string identifying the Shape File data
column/variable to be used to link the polygons to the
name table rows.
ShapeLinkName is not a character string.
Value= <ShapeLinkName>.
The value in the ShapeLinkName call parameter is not a character
string that can be used to access a variable in the shape file header.
Review the <ShapeLinkName> character string and correct.
Map Headers and parameters
The MapHdr parameter does not contain character strings
for use as the column headers. The MapHdr parameter will be ignored.
The MapHdr call parameter does not contain character strings.
Inspect and correct. The default of c("","Areas") will be used.
The MapHdr parameter must be a simple vector type.
The MapHdr call parameter must be a simple vector type variable with
1 or 2 elements. Data types of Data.frames, lists, tibbles, and other
advanced structures are not allowed. Correct and retry.
The MapHdr parameter has zero or more than 2 elements.
Only the first 2 will be used.
The MapHdr call parameter contains more that 2 elements.
For example: c('hdr1','hdr2','hdr3') Only the first 2 elements will be used.
It is suggested the max length of the MapHdr strings
be 16 characters.
The MapHdr values (2) should be less than 16 character to keep the
map type glyph columns from becoming too wide. Consider shortening
the character strings.
Map Minimum and Maximum Height
The MapMinH parameter does not contain numeric value.
Default Value is used.
The MapMinH (Map Minimum Height) parameter is provided, but does not
contain a numeric value. The default value of 1 inch is used.
The MapMinH minimum height value is out of range
(0.4 to 2.5 inch). The default will be used.
The MapMinH parameter value must be between 0.4 and 2.5 inches to be usable.
The default of 1 inch is used.
The MapMaxH parameter does not contain numeric value.
Default Value is used.
The MapMaxH (Map Maximum Height) parameter is provided, but does not contain
a numeric value. The default value of 1.75 inches is used.
The MapMaxH maximum height value is out of range
(1 to 2.5 inches). The default will be used.
The MapMaxH - value must be in the range from 1 to 2.5 inches. The default
value of 1.75 inches will be used.
The MapMinH value must be less than the MapMaxH value.
Will swap values.
The MapMaxH value must higher than the MapMinH value. The values are swapped.
ID Header and Parameters
The IDHdr parameter does not contain character strings
for use as the column headers.
The IDHdr call parameter must be a character string to be used
for the ID glyph column headers. The default header of the border group
name and Areas will be used.
The IDHdr parameter must be a simple vector type.
The IDHdr parameter value must be a simple one dimensional vector
with 2 value, like the MapHdr parameters. The default of the border
group name and "Areas" will be used.
The IDHdr parameter has more than 2 elements. Only
the first 2 will be used.
The IDHdr parameter has more than 2 values (elements) in the vector.
Only the first 2 values will be used for the IDHdr headers.
It is suggested the max length of the IDHdr strings be
12 characters.
To keep the ID glyph column from becoming too wide, it is recommended
the IDHdr string be limited to less than 12 characters each.
Reduction Percentage Parameter
The ReducePC parameter must be a numeric value. The default
of 1.25 % will be used.
The ReducePC (Reduce Percentage) parameter is not a numeric value. The
default value of 1.25% will be used. This parameter indicates how much
of the original spatial information will be kept when the geometry
is simplified by the rmapshaper package.
The ReducePC parameter is not a simple vector. The default
value of 1.25 % will be used.
The ReducePC parameter can only be a single value. The default value
of 1.25 % will be used.
The ReducePC parameter has more than one value. Only the
first value will be used.
More than one value is provided with the ReducePC parameter.
Only the first value will be used.
The value of ReducePC is out of range (0.01 to 100 %).
The default value of 1.25 % will be used.
The value of the ReducePC parameter must be between 0.01 and 100 percent.
It is out of range. The default value of 1.25 % will be used.
LabelCex Call Parameter
The LabelCex parameter must be a numeric value. The default
of 0.25 will be used.
The value of the LabelCex call parameter must be numeric. The default
of 0.25 will be used instead.
The LabelCex parameter must be a simple vector. The default
value of 0.25 will be used.
The LabelCex parameter value is not a single value vector but is a more
complex data type. The default value of 0.25 will be used.
The LabelCex parameter has more than one value. Only the
first value will be used.
The LabelCex parameter contains more than one value. Only the first
value will be used.
The value of LabelCex is out of range (0.05 to 10). The default
value of 0.25 will be used.
The LabelCex parameter value must be between 0.05 and 10. It is out of range
and the default value of 0.25 will be used.
proj4 Parameter
No projection provided in the shapefile or the proj4
call parameter,will be set to a Long/Lat projection.
No projection was found in the shapefile or structure provided the
function or in the proj4 call parameter. The boundary data is
assumed to be longitude/latitude values and the projection
in the spatial structure will be set to a Long/Lat projection.
The proj4 call parameter set to NA or Empty(”) or
is not a character string. The parameter will be ignored.
The proj4 call parameter must be a valid character string in the
proj4 projection syntax. A value of NA is not acceptable and the
proj4 call parameter will be ignored.
3304 The proj4 call parameter is not a valid
character string or 'crs' structure for a projection. The proj4 parameter
will be ignored. The default AEA projection will be used in needed.
The function was unable to process the provided proj4 projection character
string using the st_crs function. The parameter will be ignored.
A generic AEA projection based on the center of the map and its height
will be used to transform the map for use in the link micromap.
Invalid proj4 parameter value provided. Parameter
will be ignored.
When trying to process the proj4 projection string using the
sf package st_crs function, a error was raised. The proj4 string
can not be process properly and will be ignored.
The proj4 call parameter specifies a long/lat
projection.
proj4: <proj4>
The final projection can't be a longlat projection.
The proj4 parameter is ignored and a AEA projection will be created.
The proj4 call parameter specifies a long/lat projection (see <proj4>
for details). The final projection of the map for linked micromaps
should not be long/lat. Therefore, an Albers Equal Area projection
is calculated based on the centroid of the map with secondary
latitudes 1/4 the maps height above and below the centroid.
The proj4 call parameter does not have +units=m,
changing string to meters.
The proj4 call parameter projection does not have a component of +units=m
to set the projection to meters as required. Since this projection
will be the final projection of the map, the +units has been changed
to m to force the final projection to be in meters.
Shape File Processing and Call Parameter
The Shape file SPDF structure was passed to the function
in the call.
This message is an informational message to document the shape file
(as a SpatialPolygonsDataFrame) was passed to the function instead
of passing the directory and name of the shape file.
Reading shape file from dir: <SFDir> file: <SFName>
The function is going to read the shape file from the directory <SFDir>
and layer name <SFName> using rgdal\'s readOGR function
with verbose = TRUE.
The sf st_read can not import the shapefile as specified.
The errors reported were: <res1> <res2>
In the attempt to read the shapefile, the function attempts to use the dsn= parameter on the st_read for the directory to the shape files and the layer= parameter to specify the specific shape file. This is one of two modes the st_read function can use depending on the driver it selects. If this mode does not succeed, the function will attempt to use the mode where the layer= is not used and the dsn= parameter must contain the entire directory and filename, minus the extentions. If either are successful, this message is generated and the two error messages are reported in <res1> and <res2>. Review the error messages and correct the directory name used, the filename referenced and make sure both exist.
Spatial Driver found in boundary data file read
was <SReadDriver>.
The <SReadDriver> was identified during the st_read of the Shape File.
The shape file structure was passed to the function
in the call.
The type of data passed to the function in the ShapeFile parameter
indicates the user is passing a spatial structure rather than a filename.
The type of structure will be validated.
The spatial structure passed to the function via the
ShapeFile parameter is not a SPDF or a sf full structure.
Please correct and try again.
If a spatial structure is passed to the BuildBorderGroup function,
it must be a SPDF or sf structure containing the data information.
The data information is required to be able to match the
polygons in the structure to the rows in the name table.
The projection field in the shapefile is empty,
set to <OrigLongLat>
The proj4string slot in the SpatialPolygonsDataFrame for the shape
file is empty. The projection for the shape file is assumped to be
a long/lat projection and will be set to a long/lat projection of <OrigLongLat>.
Checking Shape Link Name Column: <ShapeLinkName>
Checking to make sure the shape file data.frame header contain a variable by
the name of <ShapeLinkName> as specified
on the function call.
The ShapeLinkName provided: <ShapeLinkName> does not exist in
the shape file data.
The shape file variable <ShapeLinkName> does not exist in the data.frame header.
Check for the existence of the variable and for possible spelling errors and return.
Shape file link variable name is valid, values will be
cleaned up and stored in variable X__Link.
The specified Shape File link variable name was found in the
SpatialPolygonsDataFrame. Its contents will be cleaned up and saved
in variable X__Link for later use.
Name Table General Issues
The Name Table was read from: <NameTableType>
<NameTablePath>
The initial name table file will be read from <NameTablePath>
containing the directory and filename. The name table file type
is <NameTableType>.
The Name Table Path provided to read the Name Table does not exist. Value = <NameTablePath>. The Name Table Directory and Filename to read the Name Table does not exist. Please check the Name Table file's location and try again. The Name Table must be accessible to run the BuildBorderGroup function.
The NameTable in the .rda file is not a data.frame.
Please correct and retry.
The initial name table can be constructed and saved in the R .rda
format file. When read, the contents of the file is not a data.frame
structure and cant be used. Research the cause and correct.
There are more than one data.frame in the .rda
file provided for the NameTable. Provide only one data.frame table
in the .rda and retry.
When the .rda file was opened for use as the name table data.frame,
more than one was found. The .rda should only contain one data.frame
for use as the name table.
Name Table Link column is: <NameTableLink>
The function will use the <NameTableLink> column to pair up the name
table rows with the collections of polygons in the shape file.
The Name Table has no columns of data.
The name table as read, does not contain any data columns or meaningful column
labels. Research and correct.
The Name Table has no rows or areas.
The name table as read, does not contain any area rows. There must be one
row per area in the map. Research and correct.
The column specified in the NameTableLink calling parameter
does not exist in the loaded Name Table.
The value specified in the NameTableLink call parameter is not the name of a column
in the read initial name table. If the Link column exists, it will be used as
the NameTableLink column to match the name table areas to the shape file polygons.
At least one of the following columns must be present in the
NameTable file: <List of Column Names>
Please correct the spreadsheet and
try again.
One of the columns named in <List of Column Names> must exist in the name table.
If none of the columns exist, then the initial name table is not valid and needs
additional work to permit the building of a border group.
The following columns are not needed and will be deleted from the
Name Table:
Extra columns were found in the name table. These columns are listed below and
will be deleted. Check for typos to make sure all of the required information
is retained.
Name Table Name, Abbr, ID, Alias and Alt_Abr data errors
The <inx> column contains duplicate entries. Correct and retry.
The column identified by <inx> was inspected and found to contain duplicate values
(entries). Duplicates are not allowed in location ID columns that must be unique
for each row. Research the values in column <inx> and correct.
The <inx> column contains NA or blank values that are not
allowed.
The column identified by <inx> contains blank values or NA values. These values
are not permitted in location ID type columns. Research and correct the values.
The Abbr column in the Name Table is not included.
Will attempt to backfill it from other information.
The Abbr location id should be supplied in the name table whenever
possible. If it does not exist, it will be created using other
information provided in the the name table.
The Name Table Abbr field is persent - no backfill
required.
The name table abbr field was present. Processing continues normally.
Some of the Name Table Abbr values are longer than 5
characters. It is recommended the Abbr values be keep short.
It is recommended the character string values in the Abbr column be kept
to 5 characters or less. This keeps the location ID fields in the data
smaller and less work for the preparer and keeps the ID glyph abbr
label narrower to more statistics type glyphs to be presented.
None of the columns needed are present. The Link and
one of the Name, Abbr, and ID column should have been there. This should
never happen with the previous checks.
If this error message occurs, then something has happened during
the processing. This situation should have been caught earlier.
Research the name table and make sure all required columns are present.
Checking ID column in the name table to make sure the
values are numeric with leading zeros.
While the ID values of location ids are numeric, a lot of them have
leading zero. The IDs are check to ensure they are numeric, then
converted to character format and leading zeros added to help in
later comparisons.
The ID column is not present. A numerical sequence number
has been used to fill the column.
No ID colomn is present in the name table. An ID column consisting of
a sequence of 1 to n is used.
The ID data column is not all numeric values. Values
will be assigned.
Not all of the ID values are numeric or valid. The rows with bad
ID values will be replaced with new values.
Name Table L2_ID_Name column contains row(s) with NAs
or ” values. L2 feature disabled.
All of the rows in the Name Table L2_ID and L2_ID_Name columns must
have values for the L2 Feature to operate correctly. Inspect these
columns in the name table provided and correct.
Name Table L2_ID column contains row(s) with NAs
or ” values. L2 feature disabled.
All of the rows in the Name Table L2_ID and L2_ID_Name columns
must have values for the L2 Feature to operate correctly.
Inspect these columns in the name table provided and correct.
Name Table regName column contains row(s) with NAs or ” values. Region feature disabled.
All of the rows in the Name Table regID and regName columns must have values for the
Region Feature to operate correctly. Inspect these columns in the name table provided and correct.
Name Table regID column contains row(s) with NAs or ” values. Region feature disabled.
All of the rows in the Name Table regID and regName columns must have values for the
Region Feature to operate correctly. Inspect these columns in the name table provided and correct.
The Region Feature has been disabled. The number of regions defined
is either 1 or is equal to the number of areas.
For the Region feature to operate, there must be more than 1 region and less than the number
areas. If equal to 1 or equal to the number areas in the map, the Region function has no
value and is disabled.
Name Table Modification Parameters
The Name Table in the <inxRN> area row and in the <inx> column
is not numeric and has a bad value of: <WrkVal>. Value set to the default. Fix and retry.
The data value in the row named <inxRN> for column <inx> has a value <WrkVal2> that is
a non-numeric value. Correct and rerun.
Data in row: <inxRN> for <inx> parameter <WrkVal2> is out of range.
(<low> to <high>)
The data value in the row named <inxRN> for column <inx> has a value <WrkVal2> that is
out of range. The acceptable range is from <low> to <high>. Correct and rerun.
Map Labels and Coordinates
The MAPL column in the name table is not character data. Labeling
will not be done.
The map label must be a character string value. Labeling for the row will not be done.
If the MapL column is present with label(s), then MapX and MapY
must be present. One or the other is missing. Labeling is not done.
For each Map Label in the name table, there must be a valid MapX and MapY value
specifying the location on the map to draw the label. If all three are not present,
the label for that area will not be drawn.
MapL value is present and there are no valid MapX coordinates.
Labeling will not be done.
A map label for an area is present, but the MapX coordinate is missing or invalid.
The label will not be drawn. Correct the MapX and possibly the MapY values.
MapL is present and there are no valid MapY coordinates.
Labeling will not be done.
A map label for an area is present, but the MapY coordinate is missing or invalid.
The label will not be drawn. Correct the MapY and possibly the MapX values.
The MapL label - <label> - for area <area name> should be 3 or less characters
to be usable.
It is recommended the map label string be shortened to 3 character or less to reduce
the space required to draw the label.
No MapLabel content - processing skipped.
The retired MapLabel column in the Name Table does not have any values. Converting
the MapLabel column into the MapL, MapX, and MapY columns will be skipped.
Some of the items in the MapLabel entry for <name table row>
are NA or blanks. Will be ignored.
The MapLabel entry for the <name table row> is blank or NA and can not be used.
The value is ignored.
The MapLabel value for <name table row> is not valid. Must be a
character string with three values separated by commas. The value is ignored.
The character string in the MapLabel field must consist of three values separated by
commas. The first value is a character string (the label) and the other two columns
are numbers representing the X and Y coordintates to draw the label. The X and Y
coordinates are in the units and origin of the original shape file.
The label value in the MapLabel entry for <name table row> is > 3 char.
Only first 2 characters will be used.
The recommend length of the map labels is 2-3 character. One or more of the labels are
greater than 3 character. Adjust the length of the map labels.
One of the coordinates in the MapLabel entry for <name table row>
is/are not a number. <MapX> or <MapY>
Check the MapX and MapY coordinates values, they must be numerical values.
One of the MapLabel coordinates for <name table row> are out of range.
Entry ignored.
The values for the MapX and MapY coordinates must be within the range of the space covered
by the defined map in its original projection. If the values are outside of the map's area,
this error will appear. Inspect the coordinates and adjust as needed.
Shape File Projection
The projection provided in the Shape File does not have +units=m,
modify and setup for re-projection to change to meters.
The final projection must be in meters. The projection in the Shape File or
structure is not meters. A new projection based on the original is created
with +units=m. This projection will be used for the final transformation of the map.
Found +units=m in projection string of non-longlat projection
in the shape file.
The projection in the shapefile is not a long-lat projection and it is already in units of
meters. No modifications are required and no transformation will be needed.
Boundary data clean up
Cleaning up polygons in spatial structure.
The spatial structure will be passed through sf st_is_valid and st_make_valid
functions to clean up the polygon geometries.
Shape File contains invalid polygons. The indexes and associate reasons are\:
This is an informational message indicating the shape file geometry will be passed through
the sf package validation and make_valid function to clean up the geometry. S2 functions
are not used.
Matching Shape File to Name Table
Comparing shape file to name table links
The link values in the shape file identified variable are being compared to
the values in the name table identified column. If there are any
polygons that do not have a matching name table entry, the polygon(s) will
be deleted.
The following Shape File areas are not in Name Table:
<List of Shape File areas not in Name Table>
The areas will be dropped.
Any area (polygon) in the shape file that does not have a row in the Name Table, will
be dropped from the final border group map.
Linking Name Table to Shape File
Comparing the link values to tie the name table to the shape file.
This is the reverse comparison to comparing shape file links to the name table links.
If a name table area/row does not have a matching shape file set of polygons, then
the functions execution will be halted. Either provide the polygon(s) for the
area included in the name table or remove the name table entry.
The following Name Table areas do not have boundaries in the ShapeFile:
<List of Name Table Rows without polygons>
Correct and retry.
The BuildBorderGroup function can not continue if there is not an area(set of polygons) in
the shape file for the Name Table row. Execution stopped.
Some of the polygons in the Shape file still do not belong to
areas in the name table. Polygon(s) are ignored.
Not every polygon is linked to an area in the Name Table. These polygons will
remain in the map, but will be ignored. If multple ignored polygons share the
same boundary, the boundary may not be drawn.
Shape File Simplification
Simplifying the shape file boundary data with rmapshaper.
This message is a progress report to indicate the processing has called rmapshaper to
simplify the spatialpolygon data.frame (shape file) to the ReducePC specification
with a weighting of 0.9. Check the results to see if the map is over simplified or not
simplified enough. Change the ReducePC call parameter and rerun until the map becomes usable.
rmapshaper parameters before simplification : Keep= <MS_Keep>
Weight= <MS_Weighting>
The rmapshaper ms_simplify function parameters for the MS_Keep and MS_Weighting
are listed to document the amount of boundary reduction and smoothing. See the
rmapshaper ms_simplify documentation for more details.
rmapshaper processing completed.
The rmapshaper ms_simplify function has completed its work and has returned a
modified spatial structure.
Map area aggregation completed.
After the rmapshaper ms_simplify completes its work, the polygons in the spatial
structure are aggregated into multipolygons to create one rows per area
in the sfc structure. This message signals the aggregation step has been
completed. The number of features in the sf structure are now equal to the
number of areas in the name table.
Area Size Inspection and Name Table Modifications
The following areas may be too small (<0.03%): <ListOfAreas>
In general, if an polygon is less than 0.03
shaded for linked micromaps, it may not be visible to the graphs users. The list
of areas that may be too small are provided in the <ListOfAreas> string
in the message. Review the list of areas and a plot of the map and determine
if the areas should be enlarged or moved or manually manipulated to ensure their
shading can be seen easily. checkPointReStart call parameter allows the user to
intercept the shape file before the VisBorders data.frame format datasets are created
to make custom changes.
Info:No modifications are required to map.
The name table did not contain any shift, scale or rotate parameters for any areas
in the maps. No modifications were done.
Name Table Modifications
Identifying neighbors for each area.
The function is now searching and identifying the neighbors for each area in the map.
This information is used in the color selection algorithm to keep two areas that
share the same boundary line from being colored the same color in the examples
and any future feature that is added to micromapST.
The list of neighbors for each area are in the name table 'NB' column.
The neighbor list is stored in the 'NB' column of the name table temporarily. This will be
save across the checkpoint restart for future use.
Info:No modifications are required to map.
The name table did not contain any modifications for areas in the map.
Area: <area key> will be adjusted using the <modification list> values.
<area key> area has parameters to have shifts, scaling and/or rotate modification done to the
boundaries.
Xoffset: <Xoffset> Yoffset: <Yoffset> Scale: <Scale> Rotate: <Rotate> in radians: <radians>
The name table contains the following values for each of the possible modification for the
identified area in # 3811.
Re-inserting area polygons for <area key>
The area <area key> has been modified and is being re-inserted into the map and
neighboring areas.
plot of original area before modifications in blue:
A screen plot was drawn of the area before modification with blue lines.
plot of main area after modifications in green:
A screen plot was drawn of the area after the modification with green lines.
Original neighbor boundaries in magenta for <neighbor area>
A screen plot was drawn for an neighboring area to the modified area before
adjustments were made with magenta lines.
Trimmed neighbor boundaries in seagreen for <neighbor area>
A screen plot was drawn of an neighboring area to the modified area after trimming
adjustments were made with seagreen lines.
Name Table modificiations to Shape file are completed.
All of the name table specified area modification has been completed.
Area Coverage Analysis
These area coverage estimates were done after the name table
modifications. The area's coverage may have be increased or decreased from the
original Shape file. This coverage review is done against the current area's
coverage after all modifications. If an area's coverage is less than 0.057%
of the total map coverage of <TotalArea> * 0.00057 m^2 may not be large enough
to be visible in a in a linked micromap graphic. The area(s) that should
be reviewed is(are):
<list of areas that may be undersized.> This message provide a list of the areas that may not be visible when willed with the link color in the linked micromap. Investigate and adjust the area size as needed.
All of the area appear to be big enough to show the shading
in a linked micromap.
Verify all of the areas can be seen when the linked micromap color shades the
area.
Final processing and transformation
Transforming projection of Shape file and label points.
This is a progress message to indicate, the function is about to perform any
requested or needed project transformations on the shape file and the label points in the
name table. If the resulting projection is not correct, check the value provided
via the proj4 call parameter and whether the default AEA projection based on the center
of the map is correct.
Using user provided projection: <proj4>
This message outputs the projection string provided by the BuildBorderGroup caller via the
proj4 calling parameter to help document the border group.
Re-transforming shape file using original projection, with +unit= changed to meters.
The original projection provided on the shape file did not have the +unit= proj4/6 parameter set
to meters. BuildBorderGroup assumes the projection of the shape file just prior to the conversion
of the boundaries to the VisBorders format is meters. The proj4 call parameter or the projection
contained in the shape file was inspected and a new projection created with the +unit= paremeter set to
m was created and used in the maps transformation before the VisBorders data.frames are created.
Projecting shape file using created AEA projection.
Since the shape file is a long/lat projection and no projection was provided via the proj4 call parameter,
the BuildBorderGroup function will calculate an Albers Equal Area projection based on the centroid
of the mapped area and secondary latitudes at 1/4 the map's height above and below the center latitute.
This results in all of the areas being presented with equal area (or shading area).
Transformations completed.
All of the transformation and projection modifications have been completed.
Info:No transformation was done to the map.
No transformations were required on the final map. The original projection is still
in use.
The proj4 value character vector is not a valid projection.
Must be a value acceptable to st_crs(). proj4 is ignored.
Verify the proj4 call parameter is correct and has the right syntax for the proj4 projection
specification. The proj4 parameter will be ignored.
Generating Test Plot
The length of name table and the number of areas in the shape file are different.
The number of areas listed in the name table and the number of areas with polygons in the shape file
are different. Either the name table has more entried than the shape file or the shape file has
more areas then the name table. Correct and re-run.
checkPointRestart Writes
The checkpoint files will be store in the Checkpoint directory : <CkptPath>
The directory used to save the check point files is <CkptPath>.
Checkpoint - Name Table RDA Filename: <NTCkpt> Name Table CSV
Filename: <NTCkptcsv>
The check point Name Table data.frame is save as a data.frame (.rda) in the <NPCkpt> file
and as a .csv file in the <NTCkptcsv> file.
Checkpoint - The shape file spatial data is written to: <SFCkpt>.
as a set of ESRI Shapefiles. The RDA image spatial data is in: <SFCkptRDA>
The check point image of the shape file is saved in ERSI Shapefile format in the
<SFCkpt> shape files and as a SpatialPolygonsDataFrame in the <SFCkptRDA> RDA file.
Checkpoint - areaParms data.frame in: <APPCkpt> as <APCkpt>
The control areaParms dataset has been checkpointed in the <APPCkpt> directory
as the <APCkpt> filename.
BuildBorderGroup has completed writing of the checkpoint files to disk
for possible editing and restart. They are located in the following directory :
<Check Point Directory>
The border group dataset has been successfully written to directory included in this
message and is ready for use, reuse, or editing.
The check point Shape File for the border group is saved to: <Shape File Name>
This message identifies the filename of the saved Check Point Shape File being used to build this
border group. It can be editted, but must be returned to the same directory with the same
name for the checkPointRestart logic to restart the border group building process.
After editing, the results must be saved back to the same directory and filename.
The check point shape file can be manually edited to make areas more visible in the small micromap.
Once the modifications are completed by any external GIS or polygon editor, the final shape file
must be save in it's original directory and under the original filename. If you want to save
a copy of the shape file producted by the BuildBorderGroup function, save it before making modifications.
Checkpoint Reload to Continue
Check Point Restart Process Initiated.
This is an informational message letting the user know the checkPointReStart process has been
initiated and all of the working files have been reloaded into the function.
No Border Group or Name Table directory provides. Cannot find restart files. STOP. The name table and border group directories provided for the checkPointReStart could not be located. Make sure to provide the same directories and file names used in the original BuildBorderGroup function call.
NameTable directory: <NameTableDir>
The name table directory being used for the check point restart is <NameTableDir>.
Using the BorderGroup directory: <BorderGroupDir> to locate the
checkpoint files.
The name table directory was not provided. Since the target output directory was the
BorderGroupDir path, the BorderGroupDir call parameter will be used to locate the
checkpoint folder and files.
Reading areaParms dataset: <areaParm.rda file>
The restart logic in the checkpoint feature is loading the areaParms.rda
file from the <areaParms.rda file> path.
Reading shape file: dsn=<Shapefile Directory> layer=<Shapefile base name>
The restart logic in the checkpoint feature is loading the saved shape file from the
checkpoint folder <ShapFile Directory> and the layer is the <ShapeFile base name>. These are
the checkpoint folder locations not the original Shapefile directory and file parameters.
Reading NameTable: <NT filename>
The restart logic in the checkpoint feature is loading the checkpointed version
of the name table (areaNamesAbbrsIDs) from the checkpoint folder.
The Shapefile has been modified and the 'X__Key' variable has been removed.
Rerun the BuildBorderGroup function to restore the variable.
Then do not remove the variable when editing the shape file.
BuildBorderGroup places a couple of special variable in the shape file to allow the polygons to be
matched up with the areas in the name table. It appears the 'X__Key' variable as been removed.
Editing is permitted, but none of the 'X__' variables added by the BuildBorderGroup should be
removed. Rerun BuildBorderGroup to restore the variables, do the required editing, and then
do a checkPointReStart = TRUE to build the micromapST boundary files.
Creating micromapST boundaries files
The shape file variables have been editted. The 'X__Key' variables are
missing. Redo the edits and do not delete the 'X__Key' variable.
During the editing of the Shape file, the 'X__Key' variable must be kept the same to allow the
Name Table to maintain a link between the area location ids and the associated polygons.
Reedit the Shape File and ensure the X__Key variable is untouched.
The shape file row.names in the spatial structure do not match
the 'X__Key' variable values. Investigate and correct cause. row.names are reset to the
'X__Key' values.
The contents of the X__Key variable/column. If they are changed, the tie between the
Name Table and the spatial data is broken. Reedit the Shape file and return the X__Key column
to its original value written when the checkpoint file was created.
Invalid polygon found in <area key> area #:<nz> id:<z>
Message is issued by the BuildVisBorders function as it is process the spatial
data and converting it into the data.frame format for micromapST drawing. The message
indicates an invalid polygon was identified for the <area key> at row number <nz> with the
id of <z>. Identify the bad polygon and fix its geometry.
Completed conversion to VisBorders format.
All of the boundary data for the individual areas, Layer 2 boundaries (if requested), regional
layer boundaries (if requested), and the map outline have been converted to the data.frame
point format required by micromapST to draw the miniture maps.
Creating the 4 micromapST boundary layers (area, L2, L3, and Regions).
The shape file will be copied to the area, L2, Regions, and L3 shape file images and
merge based on the spaces layed out in the name table for L2 spaces, Regional spaces, and the
outline of the entire map (L3).
Completed conversion to VisBorders format.
The conversion from spatial structure to micromapST's data.frame format has been completed.
Writing an images of each Border Group data.frame for <BGBase>
A single R .rda file will be written for each of the 6 data.frames included in the
border group <BGBase>. The can be found in the border group directory under the
names of <BGBase>_<name of data.frame>.rda.
Border Group Created - Successfully.
The writing of the border group dataset has been successful. The border group is now
ready for use.
Summary build report of names, abbr, id and other data is written
to <RepOutFileName>
Each border group needs documentation to provide the user to let them know what the list of
Names, Abbrs, IDs, Aliases, and Alt_Abr location IDs are available in the Name Table for the
space mapped by border group. A copy of the key information from the Name Table is printed for
use in this documentation in a text file in the border group directory under the name of
<RepOutFileName>.
Border Group: <finalBGroup> is done.
The process of gathering the information, validating it, editing it, and converting
in to a format for micromapST has been completed and the dataset written to disk.
The proj4 value character vector is not a valid projection. Must be a value
acceptable to st_crs. proj4 is ignored.
The projection character string provided on the proj4 call parameter is not a valid.
It can not correctly processed the sf function st_crs to be used as a projection.
Make it complient with PROJ4 speciications. Correct and re-run. (convertProj4)
AdjPolygons - Polygons level value is not a Polygons structure.
In processing the SpatialPolygonsDataFrame image of the shape file, the Polygons
level below the polygons level is not a valid SpatialPolygons structure and
can not be processed. Execution is halted. Review the shape file or
SpatialPolygonsDataFrame structure and make sure it is correct.
The number of areas in the map is 1 or less. A border group can
not be made.
The number of areas in the name table and the shapefile must be more than 1 polygon or area to
be able to build a border group. Check the name table and shapefile contents.
Errors have been found and noted above. Execution stopped.
Please fix problem(s) and retry.
Inspect the previous messages and errors to identify the cause of problems that stopped
the execution of the function. Fix the problem and rerun.
The border group has been created. To make sure the user can use the same location ids that were used in the name table, the following table is printed to provide documentation on what location IDs are available to the user when they are assemblying their data. Each column present in the name table is listed: Name, Abbr, ID, Alias, Alt_Abbr. These are the primary columns used by the data gatherer.
If only the regional space information is present in the name table, then the regID and regName name columns will be displayed in the report for reference.
If Map Labels and name table modifications were specified for any area in the name table, The MapL, MapX, MapY, Xoffset, Yoffset, Scale and Rotate columns of the name table are listed in this section for later reference. If neither Map Labels or modifications were used, this section of the report is not outputed.
If only Map Labels were implemented in the name table and not name table modifications, only the values for the MapL, MapX, and MapY name table columns are displayed in this section of the report.
If no Map Labels were specified in the name table, but name table modification were specified, then this section of the report will list only the name table modification values for future reference.
If the name table contains Layer 2 and Regional space information for use by micromapST, then this section of the report will be outputed listing the L2_ID, L2_ID_Name, regID, and regName name table column information for later use. If the L2 and regional information is not present this section of the report is not displayed.
If only the Layer 2 information is present in the name table, then the L2_ID and L2_ID_Names name table columns will be displayed in thei report for reference.
Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
micromapST verifies as much of its operation and the data provided by the user to try and identify and document to the user problems before they cause an R warning or error exception and throw a R like cryptic message. Each message, warning and error message is documented with the general form of the message and a friendly explanation and advice on what may be wrong and how to fix the issue identified.
The micromapST messages all start with "***" to help quickly find them in the warnings() logs and general output.
The general format of the messages is:
***XXXX NNNNNN text of message or ***XXXX NNNNNN CC text of message
where the XXXX is the message alphanumeric identifier, NNNNNN is the name of the function or glyph issuing the message and CC is the glyph graphic column number in the panelDesc structure.
The four alphanumeric message identifier following the "***" is a unique message identifer to help find the explanation in this document and to help discuss problems over the email more accurately. It's always best to include the log of the preparation and execution of micromapST as well as the data used when requesting help.
The first two digits/characters of the message identifier indicates which logical segment of the package generated the message:
Main package initialization, startup, call argument validation, user data validation and panelDesc structure and contents validation.
panelDesc and glyph parameters and data validation, processing and graphic generation
Border Group data and structure validation and setup,
Internal Operation
Informational Messages
The second field of the message contains the name of the section of code generating the message. If a glyph generated the message, the name of the glyph (type=xxx) and the panel column number are displayed to help you focus on the parameters and data that may need to be inspected. Each message contains specific information (like panelDesc named list, checked value, etc.) to help identify the cause of the message and help lead the user to a fix.
The remainder of the message contain a text explanation of the issue and a list of parameters: glyph column number/name, data row or column, variable name, and value related to the issue.
The abbreviation in most messages identifies the processing step in the package the message was produced:
Call Arguments-Parameters
General call parameter messages
Row Name Column
Row Name type
Ascend sort order parameter
Title parameter
PlotName parameter
Group Pattern parameter
Scaling parameter
Staggered Label parameter
Column Size parameter
Statistics Data Frame parameter or content
Panel Description data frame parameter or content
Sort parameters
Use of the Alias location ID feature
Border Group Directory parameter
Border Group Name parameter
Border Group data set content
General Border Group boundary data
color call parameter
details call parameter
Panel Description dataframe columns. These messages relate to the columns within the panel description data frame rather than the glyph itself.
Panel Description dataframe general issues
Internal issues
Messages related to a specific glyph graphic. The <glyph name> would be one of the following: dot, dotconf, dotse, dotsignif, ctrbar, normbar, segbar, map, mapcum, mapmedian, maptail, id, arrow, bar, boxplot, scatdot, ts, or tsconf.
Conventions: The user provides the data.frames for the statsDFrame and panelDesc structures. In this document, there variables are represented by the generic terms of statsDFrame and panelDesc. In the messages produced by the package, these values are replaced with the names of the user variables provided in the micromapST call.
The following values are also substituted with the real run values at time of execution:
List of column names/numbers used in the sortVar call parameter. Used to identify invalid statsDFrame column names and numbers used in the sortVar call parameter.
panelDesc variable name. The panelDesc variables include lab1, lab2, lab3, lab4, refVal, refTxt, col1, col2, col3, type, and panelData.
panelDesc glyph column number. This helps identify which graphic column (glyph) is being processing when the error/warning is signaled. The far left column number is 1.
a list of variables from the panelDesc data.frame. Used in messages to identify one or more panelDesc variables that are invalid.
detailed list variable name
last variable name used. Used to locate a value or absents of a value in a list by using the last good variable name.
a character string of the name of the border group being referenced. The border group names does not include the .RData or .rda extension used on the data file.
detailed list variable name
The labeling method requested for generating Axis labels.
A list of the sub-areas (counties, etc.) that did not have data.
list of row.names (sub-areas)
The literal rowNamesCol value submitted by the caller.
The class of the variable names "xxx"
Number of items in list or vector.
List of invalid names that were unexpected.
a general list of options, variables, or names.
The name of the data.frame used as the first call parameter. This data.frame is used for the statsDFrame and provides the numeric values for the plots.
The name of the data.frame serving as the panelDesc panel column description.
The name of the glyph being drawn.
The name of the glyhp begin drawn.
The name of statistics data.frame for the linked micromap.
The panelDesc data.frame column number (glyph column).
Is the list of duplicate entries in a list.
References the row number in the statsDFrame data.frame.
Is a list of rows related to the message.
One or more row numbers related to the message.
Identifies a not used name from name table.
Reserved for future usage.
The following is a listing of all micromapST generated messages and a description of possible causes and solutions.
Initialization, Setup and Call argument validation, panelDesc and User data validation. where y indicates the specific argument or area.
General Operation ("Z"):
CARG Key call arguments are missing, NULL, wrong type, or NA, Execution stopped.
The micromapST function requires the statsDFrame and panelDesc data frames be provided. This is the minimum set of call parameters required. If either or both parameters are missing (the first two in the function call), the execution of the function is stopped.
CARG Warnings have been found in the parameters and data. Package continues, but results are unpredictable. Review log and fix errors.
The micromapST function has detected warning in the parameters and data. The function will continue but the results may be unpredictable. Review the errors and warnings, correct the causes and re-run.
CARG Errors have been found in parameters and data. Please review program log and fix problems. Packaged stopped.
The micromapST function has detected errors in the data and call parameters that prevent the function from running. Correct the identified error and re-run.
statsDFrame call argument (1st argument) (CARG-DF):
CARG-DF First argument (statsDFrame) is missing or NULL or not a data.frame.
The first calling argument/parameter must be the statsDFrame data frame containing the data to be graphed. If the first argument/parameter is missing or NULL. Provide the correctly built statsDFrame and re- run. The data structure can not be a list, tibble, matrix, array or simple vector variable.
CARG-DF The First argument (statsDFrame) is not a data.frame class.
The statsDFrame variable holding the data for each area (county, district, etc.) in a border group is not a data.frame structure. Each column represents a data variable to micromapST. Each row represents an area being mapped and charted. The area location ids may be located as the data.frame row.names or in a data.frame column itself. The panelDesc structure uses the col1, col2, and col3 columns to specify which column in the statsDFrame should be used for the glyph. The sortVar and rowNamesCol call parameters use the statsDFrame column name or number to identify how to sort the graphics or where the area name is located in the data.
CARG-DF The <sDFName> statsDFrame data.frame must have at least 1 or 2 columns.
The <sDFName> data.frame provided as the <statsDFrame> must have at least one column or two if the rowNameCol call parameter is specified. One data column must be provided and if the row.names feature in the data.frame is not used for the location id, then one additional column is required and the rowNameCol must be specified.
CARG-DF There are duplicate entries in the statsDFrame data.frame. Duplicate entires are ignored.
The statsDFrame data frame contains rows with duplicate names. Only the first row is used, duplicates are ignored. Check the area name values in the data frame and correct as needed to make sure each row represents a unique area. The following message provides a list of the duplicate sub-areas. ***
CARG-DF The duplicate rows are: <listRowNames>
This message provides a list of the duplicate area names, abbreviations or IDs in the statsDFrame data frame. Correct the duplicate entries and re-run micromapST. ***
CARG-DF The following row in the <sDFName> statsDFrame data.frame are not found in the name table:
See continued message (0107) for details
CARG-DF <listRowNames>
This is a two line error message. The area names provided
in statsDFrame data.frame must match the names in the
border group name table. This warning provides a list of the
araa names not found in the name table. Please select the
correct border group or correct the area names in the
data.frame and re-run micromapST. Unless requested,
the function will stop.
CARG-DF The rows not matched to boundaries
will be removed and not mapped.
If a row of data does not match an entry in the name table, the
data row in the statsDFrame will be deleted. Check the
location id used on the data row to make sure it matches the
type of names you selected in the name table ("Full", "Ab", "ID",
etc.) If the row's loc id does not match, correct and re-run.
CARG-DF Data row names in the ,sDFName," data.frame
must match the location ids in the name table. Call stopped.
If ignoreNoMatch is FALSE and a data row does not have
an entry in the boundary name table then the package can not
continue to run. All data must have valid names or abbreviations
associated with each row that has sub-area boundaries in the
border group. Execution is stopped. Verify the area identifiers
in the supplied data, correct and retry.
panelDesc call argument (2nd argument) (CARG-PD):
CARG-PD The second call argument, the panelDesc
structure, is missing or NULL.
The second call argument/parameter is the panelDesc
data.frame. Without this data frame, micromapST does not know
which glyphs to draw or where the supporting data is located
in the statsDFrame data.frame. The data frame is missing,
NULL, NA or is not a data.frame structure. The data structure
can not be a list, tibble, matrix or array variable. Correct
and re-run micromapST.
"CARG-DF The second argument, the panelDesc
structure, is not a data.frame.
The second argument in the micromapST call must be the panelDesc
structure as a data.frame. No other type of variable is acceptable.
Please correct the structure and retry.
CARG-PD The following named lists in the
<pDName> panelDesc data.frame are not valid:
<PDVarList>
The panelDesc variables col1, col2,
and col3 identify the columns in the statsDFName
data.frame that contain data for each area to be used by
the glyph for the column. The column number or name
provided in the panelDesc structure is not a valid
statsDFrame column number (out of range) or name.
This error message provides a list of the invalid column
numbers or names.
CARG-PD The required type named list
is missing in the <pDName> panelDesc data.frame.
The panelDesc list named type does not exist in the
panelDesc data frame. This named list is required to
identify the type of glyph to be generated for the specific
column. Inspect the panelDesc data.frame, check for
spelling and correct the source of the problem and re-run.
Example: panelDesc=list(type=c("map","dot","bar"), ... )
CARG-PD The <panelDesc> type named
list contains one or more invalid glyph name(s): <pdTypeList>.
The type named list in the <panelDesc> panelDesc data frame
contains invalid glyph names. The invalid names are listed in
this error message. The valid glyph names are: map, mapcum,
mapmedian, maptail, id, arrow, bar, boxplot, dot, dotse,
dotconf, dotsignif, segbar, normbar, ctrbar, ts, tsconf,
and scatdot.
call argument column name/number verification
(CARG-xxx):
When a call argument/parameter contains a column name or number
referencing the <statsDFrame> data.frame, the arguments
values are checked during the initialization of the micromapST
processing. The following errors may be detected and the function
stopped.
The call arguments/parameters that contain column names/numbers are:
rowNamesCol RNC sortVar SORT
The abbreviated name (<callArgAbbr>) listed above for the call argument/parameter (<callArg>) is included in the message to assist in identifying and correcting the problem.
CARG-<callArgAbbr> A column name in the
<callArg> call argument does not exist in the
<sDFName> data.frame: <value>
One or more of the column names in the <callArg>
call argument/parameter does not exist in the
<statsDFrame> data.frame. The bad value is displayed
in the <value> part of the message. Verify that the
column name exists in the <statsDFrame> data.frame
and re-run.
CARG-<callArgAbbr> The <callArg>
call argument is empty. Argument ignored.
The <callArg> value was found to have a length of zero
- empty. The argument can not be checked and is ignored.
CARG-<callArgName> The call parameter
contains more than one value. Only the first value will be used.
The call parameter <callArgName> can only use a single value.
More than one value was specified in the function call.
Only the first value will be used.
.
bordGrp and bordDir parameter (BGBx):
BGBD The directory specified in the bordDir call parameter is not a valid character string. Check the bordDir path name and make sure it is valid.
BGBD The directory specified in the bordDir argument/parameter does not exist. Value=<bordDir>
The path specified does not exist, Processing is stopped. Correct the path name and re-run. Check to make sure the slashes in the path name are correct and that the path exists.
BGBN The value provided as the bordGrp name is not character. Fix and rerun.
The bordGrp call parameter must be a character string of the border group name (and it's file.) If it is not the name of border group provided with the package, it must be the name of the border group file (without the .rda) that is located in the bordDir path.
BGBN When the bordDir parameter is set to NULL, the bordGrp must be one contain in the package. <backageName>
When the bordDir call parameter is not provied, or set to NULL or NA, the value of the bordGrp parameter must be one of the internal border groups provided with the package. The border groups provided with the package are:
U.S. States and the District of Columbia
The 18 U.S. Seer Registry areas
The counties of the state of Kansas
The counties of the state of Maryland
The counties of the state of New York
The counties of the state of Utah
The provinces, counties and cities of the United Kingdom and Ireland
The provinces, special administration areas, metropolitan cities of China
The districts of the South Korean city of Seoul
If you are using a custom or private border group, then the bordDir parameter must be provide to specify the file directory containing the border group .rda file.
See the appropriate documention section for details on each border group contained in this package.
BGBN The bordGrp parameter has not been specified and is required when the bordDir is provided.
When the bordDir call argument/parameter is specified, the package will load a users or private border group into the package. Therefore, the bordGrp call argument/parameter MUST be specified to identify the file in the directory to load. Verify the bordGrp filename is present and exists in the bordDir directory and re-run. Execution was stopped.
BGBN The bordGrp filename must have an ".rda" or ".RData" file extension.
When private border group files are created, they must be saved as ".rda" or ".RData" type "R" files. Please re-create your private border group file and re-run. Execution is stopped.
BGBN The bordGrp file in the bordDir directory does not exist.
The package attempted to open the bordGrp file in the bordDir directory, but the file does not exist. If the bordGrp filename was specified without a file extension, the package appended an ".rda" file extension. Verify the bordGrp filename exists in the bordDir directory and re-run. Execution is stopped..
Boundary datasets:
BGBG System error encountered when loading the border group.
See error message:
The package attempted to load the user/private border group.
The R system load() function failed and returned an error message.
The error message is logged following this message using the same
message indicator. Inspect the file and resolve the R system load() error
mesage and re-run. Execution was stopped.
The second like of error message details the loading error issued by R.
The areaNamesAbbrsIDs (Name Table) is missing from
the border group.
The border group must contain the name table (areaNamesAbbrsIDs)
to be useful. Determine why the name table is missing and retry.
The areaVisBorders boundary data set is missing
from the border group.
The border group must contain the boundary data for all of the areas
in the name table. In this case, the areaVisBorders data.frame containing
the area boundary point is missing. Determine why the table is
missing and retry.
The areaVisBorders boundary data set has a NULL value
from the border group.
The border group must contain the boundary data for all of the areas
in the name table. In this case, the areaVisBorders data.frame containing
the area boundary point has a NULL value.
Determine why the areaVisBorders data set is NULL, repair and retry.
The L3VisBorders boundary dataset is missing from the Border Group.
The Level 3 (map outline) boundary table is an optional boundary points
data.frame and is missing from the border group. The package will disable
drawing the L3 borders.
The L2Borders is set to be drawn, but no L2 border data is present.
L2 Feature disabled.
The L2VisBorders boundary point data.frame is missing from the border
group when the L2Feature is enabled. The L2Feature will be disabled.
If you require the L2 boundaries
determine why the data.frame is missing and retry.
The RegBorders is set to be drawn, border data is not present.
Reg Feature disabled.
The RegVisBorders boundary point data.frame is missing from the border
group when the RegFeature is enabled. The RegFeature will be disabled.
dataRegionsOnly options is also disabled. If you require the Reg boundaries
determine why the data.frame is missing and retry.
The L3VisBorders is NULL. Ploting the L3 outline is disabled.
The L3 boundary data found in the border group is empty. The drawing of the L3
outline boundary has been disabled. This may be a normal situation and only
needs attention if the L3 boundary data was expected to be in the border group.
Correct the problem and recreate the border group.
BGBN After loading <BorderGroupName> border group, the
following optional objects are missing: <list of data.frames>
After loading the border group specified in the function call
(<BorderGroupName>), the following optional objects (data.frames)
were missing in the border group: <list of data.frames>.
These are optional and any use of them has been disabled.
If you need one of these boundary data.sets, contact the border
group builder and determine why these data.frames were excluded.
The optional boundary data.frames are: L3VisBorders, L2VisBorders,
and RegVisBorders.
BGBN After loading <BorderGroupName> border group, the following
critical objects are missing: <list of data.frames>
After loading the border group specified in the function call
(<BorderGroupName>), the following critical objects (data.frames)
were missing in the border group: <list of data.frames>.
These boundary data.frames are required for the micromapST package
to run. The execution of the package has been stopped.
Contact the border group builder and determine why these data.frames
were excluded. The optional boundary data.frames are: areaVisBorders,
areaNamesAbbrsIDs, and areaParms.
L2Border was requested, but the Name Table does not have
a L2_ID column. Disable drawing of L2.
The border group indicates a Level 2 boundary should be drawn where
appropriate. However, the name table does not have the L2_ID column
to like the Level 2 boundaries to the basic areas. Correct the border
group's name table, rebuild the border group and try again.
Regional boundaries have been requested, but the
Name Table has no regID column. The feature is disabled.
The border group indicates the Regional boundary and the regional mapping
feature has been requested, but the Name Table has no regID column.
This feature is disabled.
rowNamesCol call argument (CARG-RNC):/cr To be able to link up the statsDFrame rows to areas in the border group and its boundary information, the statsDFrame must provide a column containing the sub-area names, abbreviations or IDs that match the name table entries in the border group. The values can be supplied via the row.names on the data frame or in a column in the data frame. The rowNamesCol call parameter allows the user to specify which statsDFrame column to use as the link to the border group name table. The rowNames argument/parameter is used to specify if the link is the name, abbreviation or ID of the sub-area.
CARG-RNC The row names in column rowNamesCol
of the statsDFrame data.frame contains duplicates. Only one row per area
is permitted. Duplicate rows are: <list of rows>. The rowNamesCol will be
ignored and the data.frame row.names values will be used.
The rowNamesCol value specifies a column with duplicate area names.
There can only be one row per area. The rowNamesCol call parameter is
ignored and the data.frames row.names information is attempted to be used.
CARG-RNC The rowNamesCol argument value must
have a length = 1. Only first value used.
The rowNamesCol value must have a length of 1 and should be a
simple vector. Only one column can be specified to contain the area
names (row names.) If more then one value is provided, only the
first value in the data will be used.
CARG The rowColNames is NA. The row.names values on
the data.frame will be used.
The rowNamesCol value provided is NA and cannot be used.
Instead the data.frame row.names information will be used.
CARG The rowColNames is NA. The row.names values on
the data.frame will be used.
The rowColNames value is NA. It must be a valid statsDFrame column
name. The data in the column must contact data that matches one of
the location IDs in the Name Table. When no rowColNames is specified,
the row.names values on the statsDFrame data.frame are used
as the location ID.
CARG The rowColNames value specified does not exist
in the data.frame. The row.names values on the data.frame will be used.
The data.frame column name specified in the rowColNames call
parameter does not exist in the data.frame. To continue executing,
the data.frames row.names information is used.
sortVar and ascend call arguments (CARG-AS):
The sortVar calling argument/parameter specifies the statsDFrame column numbers and/or names to be used to sort the data and the rows presented in the linked micromaps. The sorting is executed in the order of the columns listed in the numeric or character vector provided via the sortVar argument/parameter. The column numbers and names are verified before the sorting is attempted. All references must be valid. The default sort uses the sub-area names as spedified in the rowNames argument/parameter.
The ascend flag indicates which direction the sorting will be done. ascend = TRUE requests ascending order. ascend = FALSE requests descending order.
CARG-SV The sortVar parameter is not a numerical or character vector variable. Matrix, arrays, data.frames and tibbles are not supported. Will use the default of alpha sort on area names.
The sortVar call argument/parameter must be a numeric or character vector with a length less than with 3 values. The values must be valid indexes or names of columns in the statsDFrame data.frame supplied by the user. Only a simple vector is supported. Matrixes, arrays, data.frame and tibbles are not supported.
CARG-AS The ascend parameter is not a logical variable. Must be TRUE or FALSE.
The ascend call argument/parameter must be a logical vector with a length of 1. Vectors with multiple values are not allowed. Numeric, Integer or Character vectors are not supported.
rowNames call argument and alias name
matching feature (CARG-RN & ALIAS):
The rowNames call argument/parameter specifies what type of sub-area
name/string should be used to match the sub-area name in the
statsDFrame data frame, boxplot and time series data with the
border group's name table. The valid rowNames values are:
full, ab, alt_ab, id, and alias.
The default is ab.
When alias is specified, the package attempt to link the <statsDFrame> data rows to the boundary data using a partial match between the sub-area names in the <statsDFrame> data.frame and the alias column in the name table. This feature was implemented to handle special cases were the data source does not use the regularly used names or abbreviations for the sub-areas. See the discussion on the USSeerBG border group for more details and an example. At this time, only the USSeerBG border group makes use of this feature.
CARG-RN Invalid rowNames argument value. The value must be ab, alt_ab, id, alias, or full. The default of ab will be used.
The rowNames call argument/parameter contains an invalid value. The default of ab is used.
CARG-RN rowNames=alias is not supported for this bordGrp. The default of ab will be used.
The rowNames value of alias is only supported in border groups containing the an alias column in the name table and have the border group areaParm variable enableAlias set to TRUE. The only border group supplied with the package that supports the alias name matching is the the USSeerBG border group.
CARG-RN rowNames=seer is only supported for the USSeerBG bordGrp. The default of ab will be used.
Select the correct border group or use another rowNames value.
The rowNames parameter is NULL, NA or Missing.
The default of 'ab' will be used.
rowNames must be provided to define the type of location id to use.
The default is "ab". The other options are "full", "alias", and "alt_ab".
ALIAS Alias Names(s) in the data no not match the name table for the area. The unmatched data rows are: <listRowNames>
The rowNames value of alias was specified. During partial wildcard matching of the "alias" column in the border group name table with the data column in the statsDFrame data frame specified by the rowNamesCol call argument/parameter, one or more data rows did not successfully match the name table. The <listRowNames> part of the message lists the data row values that did not match. Review the statsDFrame values used for sub-area names and the documentation of the alias strings provided in the selected border group and make adjustments as required.
ALIAS Sub-area names in the data have duplicate name in rows: <listRowNames> Only one row per area is permitted.
There are multiple rows in the statsDFrame that match a single sub-area in the border group name table. Only the first matching data row will be used. The <listRowNames> section of the message contains a list of the duplicate rows. Correct row names to eliminate any duplicates and re-run.
The rowNames parameter is not a character string.
The rowNames parameter must be provided and must be a character vector. If
it is not, the function is terminated.
title call argument (CARG-TL):
The title calling argument/parameter is used to specify a one or
two line title for the linked micromap. If not title is provided,
the titles are left empty - blank.
CARG-TL The title argument contains more
than 2 items. Only the first two will be used.
The title call argument/parameter contains more that
2 character strings. Only two title lines are supported.
Only the first two values (lines) are used.
CARG-TL The typeof/class of the title parameter
is not character. Only character vectors are supported. The title
argument is ignored.
The type and class of the title call argument/parameter must be
a character vector. Numeric, integer and logical vectors are not supported.
CARG-TL The title argument/parameter is empty.
Recommend providing a title for the linked micromap.
The title call argument/parameter is empty (length of 0).
It's recommend a set of titles be specified for the linked micromap.
Please provide a title
character vector to help identify the linked micromap and re-run.
plotNames call argument (CARG-PL):
The plotNames calling argument/parameter specifies the type of area
name to be used in the ID glyph. The allowed values are ab and full.
CARG-PN Invalid plotNames argument value. The value
must be ab or full.
The value of the plotNames call argument/parameter is not valid,
The value must be ab or full. The default value of
ab is used. Check the spelling and re-run.
CARG-PN Invalid plotNames argument value. The value must
be 'ab' or 'full'. The default of 'ab' will be used.
The plotNames argument value must be 'ab' (abbreviation)
or 'full' (full name). The default of 'ab' will be used.
If the data contails full names, correct and re-run.
grpPattern call argument (CARG-GP):
The package creates a pattern of how many sub-area rows will be placed
in each glyph group row based on the total number of sub-area with data.
The user can override this generated pattern by providing their own
pattern in the form of an integer vector. Each entry represents one glyph
group row. The maximum value for an entry is 5. The values must be in
decending order toward the median sub-area and must sum to the number
of sub-areas in the data.
For example: grpPattern = c(5,5,4,3,4,5,5) for 31 sub-areas.
If these rules are not followed, the following
messages are generated and the package created pattern is used.
CARG-GP The grpPattern call parameter must be
an integer vector.
The vector provided must be an numeric or integer vector. The package
created pattern is used. Correct and re-run.
CARG-GP The total of the rows per group must equal
the number of rows in the <statsDFrame> data.frame.
grpPattern ignored.
The sum of the sub-area rows in the grpPattern vector is not equal to the number of sub-areas present in the statsDFrame data.frame. For example, if the <statsDFrame> data.frame has 51 rows after duplicates are removed, then the sum of the grpPattern vector must be 51. A grpPattern of c(5,5,5,5,5,1,5,5,5,5,5) would be valid, but c(4,4,4,4,4,1,4,4,4,4,4) would be invalid. The package will use a calculated pattern. Correct and re-run.
CARG-GP The grpPattern call parameter vector
must be <= 5 (rows per group). A value of <x> was found.
Each element in the grpPattern vector represents the number of sub-area rows per glyph group/row. The maximum number of sub-area per group is 5. The package will use a calculated pattern. Correct and re-run.
CARG-GP The grpPattern call parameter is not
properly ordered. The number of rows per group must be in desending
order to the median sub-area.
The number of rows per glyph group must be arranged in descending
order from the ends to the median sub-area row in the middle of the list.
For example: grgPattern = c(5,5,4,3,4,5,5).
A grpPattern of c(3,4,5,5,4,3) is acceptable.
The package will created and use a calculated pattern. Correct and re-run.
CARG-GP The one of the values in the grpPattern call
parameter is non-numeric or an NA. grpPattern ignored.
The grpPattern is a vector of integers specifying the number of areas in
each group/row or perceptional row. One of the values in the vector
is a non-numeric value or an NA. Correct and re-run.
axisScale call argument (axisScale & CARG-SC):
The axisScale calling argument/parameter is a single value character
vector that specifies the type of axis scaling the micromapST should
use on the glyphs. The acceptable values are:
original axis labeling provided by the pretty function. Scaling is done in the same way as previous releases of micromapST.
labeling is done using the extended algorithm from the labeling package.
labeling is done using the extended algorithm and label scaling by applying a multiplier of hundreds, thousands, etc and a subtitle used to identify the multiplier units.
Example: before: 1000 2000 3000 4000 after: in hundreds 1 2 3 4
labeling is done using the extended algorithm and each label is scaled by applying a multiplier of hundreds, thousands, etc. and a character identifying the scaling units postpended to the label.
Example: before: 0 5000 10000 15000 20000 after: 0 5K 10K 15K 20K
axisScale must be a character vector with a single item.
CARG-SC The axisScale argument is invalid, axisScale can only be set to o, e, s, or sn. The default value of e will be used.
The axisScale call argument/parameter does not contain a valid value. It must be o, e, s, or sn. The default of e will be used.
CARG-SC The axisScale argument is not a character vector of length 1.
The axisScale call argument/parameter must be a character vector containing only one value. Numerical, integer and logical vectors are not supported. Only simple vector structures are support. Correct the value assigned with axisScale and re-run.
CARG-AX The value for axisMethod internal variable is out of range 1-5 : <locAxisMethod>
The axisScale call argument/parameter is translated into the axisMethod variable internally. The value should be between 1 and 6. If out of range an internal programming problem has occured.
staggerLab call argument (CARG-SL):
CARG-SL The staggerLab argument is not a logical value. Setting staggerLab to FALSE.
The staggerLab parameter must be a TRUE or FALSE value. If it is not a logical variable, the call parameter is ignored and the default is used. ***check***
CARG-SL The maxAreasPerGrp argument is not a numeric value. Setting to the default of 5.
The maxAreasPerGrp argument must be a numeric value from 1 to 5. The default value is 5.
CARG-SL The maxAreasPerGrp call parameter is not 5 or 6. Value set to 5.
The maxAreasPerGrp argument must be either 5 or 6. At the current time only 5 is supported and will enforced.
colSize call argument (CARG-CS):
This feature is not implemented in the version. The argument name is reserved for future use and development.
CARG-CS The colSize parameter in pDName contains NA values in the colSize column: <BadValues>. Values must be numeric and > 0. The columns in the panelDesc data.frame (panelDesc) are not value width numbers. <BadValues> lists the values in the column. The width of the glyph column will be calculated. Correct if needed and re-run.
CARG-CS The colSize parameter in pDName has values for fixed width glyphs. Value(s): <BadValues>. Value(s) are ignored and set to NA. The colSize parameter in the panelDesc data.frame cannot be used to assign the column width on a glyph column that has already been assign a column width. The value can be NA, "", " " or 0. If any other value that is not numeric, it is invalid. The value is ignored. <BadValues> contains the list of values that will be ignored.
CARG-CS The colSize parameter in <pDName> does not contain numeric values : <BadValues> The <pDName> panelDesc data.frame has an a non-numeric value, character, logical, or a numeric with a value of "Inf" in the colSize parameter column. The <BadValues> list identifies the problem. The value is ignored. Correct and re-run.
CARG-CS The colSize entries in pDName are out of range ( <= 0 or > 200 ). Values are: <BadValues> The one or more values specified for colSize in the pDName data.frame are out of range. The colSize value will be ignored.
CARG-CS The reviewed colSize parameter in pDName has bad values (reported above) and have been replaced by the mean of the good values: <meanColSize>, Bad Values: <BadValues>
The "colSize" panelDesc parameter row as a bad value, see message <BadValues>.
CARG-CS The colSize parameter in pDName contains no useful information and will be ignored. The information provided in the colSize argument in the panelDesc data.frame has no usable values or information. The argument will be ignored.
Regional Features:
CARG-RB The regionsB call argument is not a logical variable.
The default of FALSE will be used.
The regionB call parameter must be TRUE or FALSE. FALSE indicates the
region boundaries should not be drawn, while TRUE indicate the should
be drawn as required.
ARG-DRO The dataRegionsOnly call argument is not a logical variable.
The default of FALSE will be used.
The dataRegionsOnly call parameter must be TRUE or FALSE. FALSE indicates the
all regions will be used and drawn. TRUE indicates only the regions with areas
with data in the statsDFrame data.frame will be drawn. Regions the do not have
areas with data will not be drawn. This is a means of including multiple regions
in a border group and only drawn sub-regions as required by the data.
colors call argument (COLORS):
COLORS An invalid single value is provided for the colors argument. It must be 'BW', 'greys', or 'grays'. The argument is ignored.
In the micromapST function call, a single value was provided for the colors parameter. The value must be BW, greys, or grays. If not, the parameter is ignored.
COLORS The colors vector has the incorrect number of items. It must have 1 or 24 items. <number items> provided.
In the micromapST function call, a single value or a vector 24 values can be supplied. If a different number of values are provided, the parameter is ignored and the default colors are used.
COLORS The colors vector type is invalid. It must be a character vector.
In the micromapST function call, the colors parameter contains non-character values. The colors contained in the vector must be character strings identifying the colors to be used for the 24 color parameters used by micromapST. Numerical values related to the current color palette are not acceptable.
details call argument (DETS):
The details call argument/parameter contains a named list of micromapST parameters values the user wants to override or change. Most of the parameters relate to minor functional variations in how the glyphs are drawn. Several other control the details on how the graphic page is layed out and should not be changed under normal situations. Changes of these variable can cause unpredictable results.
Future: The panelDesc data frame has been extended to support glyph parameter modifications on a per column per glyph basis and is the preferred way of working with the glyphs variables. Any changes made via the details parameter are global changes and will effect all of the glyphs.
DETS The <tag> does not have a valid value: <value> Check type <method> used.
The details call argument/parameter must be a named list structure and the type of each named variable on the list must match the master table. If it is not, details call parameter is ignored.
DETS The details parameter is not a list.
The details call argument/parameter must be a list structure. If it is not, details value is ignored.
DETS variable: <name> not found in master variable list. Name is not valid, skipped.
All of the variables used in the details call argument/parameter name list must be on the master detail variable list. See the documentation section on the micromapDefSets and the micromapSTSetDefaults for the variable description. If the variable is not present, it will be deleted and ignored.
DETS Zero length variable name found in the details list after the <lastVarName> variable.
This error frequently occurs when the details parameter list contain two commas and no name/value. The position of the two commas is preceded by the <lastVarName> variable in the list. The empty variable/list will be ignored, but please remove the extra comma for future runs.
DETS The Id.Dot.pch variable is can only be set to a value in the range from 1 to 25. Using the default of 22.
The IdDotpch variable is the symbol code for the character preceeding the labels in the id glyph. Only pch values of 1 to 25 are supported.. The variable is ignored and the default character value of 22 is used.
Over time each more details variable will be checked and additional error message will be provided.
*** What other details variable should be validated?
The panelDesc is being enhanced to allow appropriate variables to be defined on a glyph graphic column basis. As this feature becomes available the documentation will be updated.
panelDesc and glyphs data messages:
As micromapST setups to create the linked micromaps, each glyphs preforms additional validate on the panelDesc variables and the associated statsDFrame data.
The message identifier (4 alphanumeric) in the message is followed by the glyph name (type) and the panelDesc panel column number (1 through "N"). This helps the user quickly identify which glyphs column in the panelDesc definition is related to and therefore the data in the statsDFrame. The text of the message with specific information to pin point the problem.
For the map, mapcum, mapmedian, maptail and id glyphs there are no additional data from the user and no additional validation is required.
For the arrow, bar, dot, dotsignif, dotse, dotconf, ctrbar, normbar, segbar, and scatdot glyphs uses data in the statsDFrame data.frame and use the col1, col2, and col3 named lists for the panel column to identify the statsDFrame columns with the required data. The col1, col2, and col3 pointers are validated and the data in the statsDFrame data.frame is inspected based on the type of glyph requested. The messages use the 3rd digit to identify which panelDesc variable was validated. 1 = col1, 2 = col2, 3 = col3. The name of the panelDesc named list is also included in the message.
For the segmented stacked bar glyphs (CTRBAR, SEGBAR, NORMBAR), col1 and col2 specify the first and last statsDFrame column of the contiguous set of columns to use for the data for the segment values/sizes. The message related to the inspection of these columns use the 3rd digit of the message number to indicate the relative column number, 1 being the first column of data up to a maximum of 9 for the last column of data. The column name is also included in the messages to help.
The following is a summary of the usage of the col1, col2, and col3 variables by each glyph:
Variable Usage by glyph:
glyph | col1 | col2 | col3 | |
dot | = | dotvalue | - | - |
bar | = | bar length/value | - | - |
arrow | = | start value | end value | - |
dotsignif | = | dot value | p_value | - |
dotse | = | dot value | standard error | - |
scatdot | = | x coordinates | y coordinates | - |
ctrbar | = | first data column in set | last data column in set | - |
normbar | = | first data column in set | last data column in set | - |
segbar | = | first data column in set | last data column in set | - |
dotconf | = | dot value | lower confidence | upper confidence |
The col1, col2, and col3 variables are not used by the map, mapcum, maptail, mapmedian, id, boxplot, ts, and tsconf glyphs.
The 3rd character in the message identifier panelDesc data column (col1, col2, or col3) pointer related to the warning message. The glyph graphic column number associated with the warning is provided in the <cc> text of the message.
panelDesc parameter checks for panelDesc variables:
This top level checks for the values provided in the col1, col2, col3 variables are done by the internal function CheckColx and include:
If does not exist, create dummy vector to for variable
Verify type of vector is numeric, integer, or character. All other types not permitted.
Fill empty string items with "NA" where appropriate.
The rest of the checking is done in the glyphs based on which variables needed.
*** Note, the use of <PDvar> appears to be a duplication of the xx variable in the message, is this true? The following warnings relate to the panelDesc variables in general:
<gNameList> The length of the glyph type list is different the length of the variables list. The number of glyph graphs indicated in the panelDesc$Type column is different than the number of data columns for each glyph variable (col1, col2, etc.) Even if the data column is not used by the glyph, the variable columns all must be the same length as the panelDesc$Type column.
<gName> <pdColNum> The first column name/number (<stColName1>) must proceed the last column name/number (<stColName2>) in the <sDFName> data frame. *** More Information to be provided later. ***
<gName> <pdColNum> The first column name/number (<stColName1>) must proceed the last column name/number (<stColName2>) in the <sDFName> data frame. *** More Information to be provided later. ***
<gName> <pdColNum> The first column name/number (<stColName1>) must proceed the last column name/number (<stColName2>) in the <sDFName> data frame. *** More Information to be provided later. ***
<gName> <pdColNum> The number of segments is <value>. It must be between 2 and 9. If over 9, only the first 9 will be used. *** More Information to be provided later. ***
PDCOL <glyph> cc The column name/number of <PDColOrig> in <PDVar> does not exist in the statsDFname data.frame. The column name or number of <PDColOrig> in <PDVar> does not exist as a column name in the statsDFrame data.frame, is a negative or zero valued integer, is not an integer, or is an integer with a value greater then the number of columns in the statsDFrame data.frame.
It appears the "y" is the col"y" indicator and "cc" is the glyph column number. If so, the "y" is redundent.
The following warning deal with the column name or number specified in the panelDesc data.frame variables col1, col2, and col3:
<glyph> cc No <statsDFrame> column was specified in <PDVarName> in the <panelDesc> panelDesc data.frame. A data column name/number is required. <usage>
The <glyph> function for column <cc> requires a valid <statsDFrame> data.frame column be specified in the panelDesc <PDVarName> variable. The value is missing. Correct the <PDVarName> for column cc and re-run.
<glyph> cc The required panelDeac variable <pdVarName> is missing from the <pdDFname> data frame. <usage>
When using the "<glyph>" in glyph column "cc", requires the <pdVarName> variable to locate the data in the <statsDFrame> data frame. This is general related to the col1, col2 and col3 variable lists in the panelDesc data frame. Check the glyph descriptions and make sure all of the panelDesc variables required for the glyph are provided.
<glyph> cc The specified column name or number in <pdVarName> panelDesc variable (<stColName>) does not exist in <statsDFrame> data frame or is out of range. <usage>
The <glyph> determined the statsDFrame column name or number provided in the <pdVarName> does not exist or is out of range. If a column name was provided, it did not match any column names in the <statsDFrame> data frame. If a column number was provided, it has a value < 0 or greater than the number of columns in the <statsDFrame> data frame. Verify the column name or number in the panelDesc column variable and re-run.
<glyph> cc The data provided in the <stColName> column of the <statsDFrame> data frame contains one or more non-numeric entries. <usage>
For the glyph <glyph> in panel column cc, the data provided in the <stColName> are character vectors and one or more values contain non-numeric characters. This may result in an improper translation to numerics. The entries are treated as NA values. Inspect the data in column <stColName> and correct.
<glyph> cc The data provided in the <stColName> column in the <statsDFrame> data frame contains one or more entries that could not be converted to numeric values. <usage>
For the glyph <glyph> in column cc, an attempt was made to convert the character data in the <stColName> column of the <statsDFrame> to numeric values. One or more entries did not convert correctly resulting in missing values. Inspect the data in the <stColName> column of the <statsDFrame> data frame and correct.
<glyph> cc The <stColName> data column in <statsDFrame> data frame is not a character or numeric vector. <usage>
The <stColName> data column in the <statsDFrame> is not a character or numeric column. The glyph requires numerical values to generate the glyph. Inspect the data and correct. Logical. List and Matrix type data is not supported. If the column is character vectors, the character vectors must only contains numerical images and must be able to be converted to numerics to draw the glyph. Inspect the data column and correct.
<glyph> cc The <stColName> data column in <statsDFrame> data frame does not contain any numerical data. No rows will be drawn. <usage>
The <statsDFrame> data frame column <stColName> does not contain ANY numerical data to permit drawing of the requested glyph. Inspect the data column and re-run.
<glyph> cc The <pdVarName> data column in <statsDFrame> data frame contains all missing values (NA) and cannot be graphed. <usage>
In the <glyph> for column cc, the data column in the user provided statsDFrame pointed to by the <PDVarName> panelDesc variable contains a NA value. The glyph for that data row will not be drawn. If this is an error, correct the data in the statsDFrame data.frame and re-run. ***check***
<glyph> cc The row numbers with missing data are: <list of row names>
The <list of row names> indicates which rows in the statsDFrame
data.frame provided by the user is missing data values, are NA, or "".
Inspect the supplied data and correct.
In the <glyph> for column cc, the data column in the user provided statsDFrame pointed to by the <PDVarName> panelDesc variable contains a NA value. The glyph for that data row will not be drawn. If this is an error, correct the data in the statsDFrame data.frame and re-run. ***check***
<glyph> cc The <stColName> data column in <statsDFrame> data frame does not contain any data. Data vector has length of zero. <usage>
The <statsDFrame> data frame column <stColName> has a length of zero. If the <statsDFrame> is properly constructed with at least one sub-area, then this should not happen. If this error occurs, check the <statsDFrame> to make sure it has not been corrupted and has rows for each sub-area being graphed and mapped.
The following warning related to the data pointed to by the panelDesc col1, col2, and col3 variable in the statsDFrame data.frame. The "y" indicates which panelDesc col (1, 2, or 3) is being referenced.
<glyph> cc The <stColName> column in <statsDFrame> data frame contains one or more missing values. <usage>
*** More Information to be provided later. *** Description needed. **checked** arrow, bar, dot, dotsignif, dotconf, ***check***
The following warnings are specific to the dot significance glyph:
<glyph> cc One or more P_Value data entries in <stColName> are missing.
The data in the <statsDFrame> pointed to for use by the dotsignif glyph (col2) for the significance P Value is not in the range of 0 to 1 probability. Review the data and correct the values to a range of 0 to 1. This is only used by the DOTSIGNIF glyph.
<glyph> cc One or more P_Value data entries in <stColName> are out of the range.
The data in the <statsDFrame> pointed to for use by the dotsignif glyph (col2) for the significance P Value is not in the range of 0 to 1 probability. Review the data and correct the values to a range of 0 to 1. This is only used by the DOTSIGNIF glyph.
The following special case that do not reference a particular panelDesc column variable. These are generated by the stacked bar glyphs with regard to the <statsDFrame> data and the panelDesc col1 and col2 specifications.:
<glyph> cc The first column name/number (<stColName>) must proceed the last column name/number (stColName>) in the <statsDFrame> data frame.
In segbar, normbar and ctrbar stacked bar glyphs, the user provides the location of the first and last data column in the <statsDFrame> data frame. Each data column between the first and last columns are used to create the stacked bar glyph. The processing is done from lower column number to higher column number. Therefore, the first column must preceed the last column in the data frame. That is must have a lower column number or a column name that preceeds the last column name in the data frame.
<glyph> cc The number of segments is <numSegs>. It must be between 2 and 9. If over 9, only the first 9 will be used.
Stacked bar glyphs use 2 to 9 data columns from the statsDFrame data.frame to create segmented stacked bar graphs (CTRBAR, SEGBAR, NORBAR). Each data column represents the length of one segment and is validated for the glyph prior to use. If only 1 segment is specified, no stacked bar glyph is drawn. If more than 9 segments exists, only the first 9 will be used in the segbar, normbar and ctrbar glyph.
<glyph> cc The number of segments is <numSegs> At least 2 data columns must be defined.
****Duplicate of 020B Used by segbar, normbar, ctrbar glyphs. ***check*** *** More Information to be provided later. ***
<glyph> cc The data provided cannot be centered around the value of <center value>.
The ctrbar glyph is designed to center the stacked bars around the value of zero. The data provided is all higher than or lower than zero. A future feature is planned to allow the center value for the ctrbar glyphs to be specified. Until then the center value is set to 0 (zero). Used by ctrbar glyphs. ***check***
BoxPlot specific processing messages (boxplot data list):
BOXPLOT cc The panelData value of <pDataValue> in the <panelData> data frame does not exist or is not accessible.
The panelData variable in the <panelDesc> data frame is used to pass completed data structures to glyphs that cannot be contained in the <statsDFrame> data frame. In this case, the variable name provided <pDataValue> does not exist or is not accessible by the package. Make sure the name is spelled correctly and exists in the .GlobalEnv (calling) environment. This message is used by the BOXPLOT, TS and TSCONF glyphs. **checked** boxplot,
BOXPLOT cc The <pdDataName> data for the boxplot is not is list.
The data structure must be a list (one per area) of boxplot data lists. With the name of each area the attribute or name of each list entry.
BOXPLOT cc The <pdDataName> structure does not have any name attributes for the boxplot data.
No names, no way to link to the boundary data. Each list in this structed must be named using the area's name that matches the name table.
BOXPLOT cc The <pdDataName> boxplot data is not a valid structure. Must contain 6 boxplot sub lists.
The <pdDataName> structure must contain the 6 data lists for each area. Less than 6 or greater than 6 lists were found in the area's list structure. Correct the structure of the list and retry.
BOXPLOT cc The <pdDataName> boxplot data does not contain all of the lists of boxplot function output. Invalid structure.
The <pdDataName> structure must contain the following 6 names list for each area: stats, names, out, group, n, and conf. If these named lists are not found, the package considers the data structure invalid and will not attempt to draw the boxplot glyph. Verify the <pdDataName> is the correct variable and was properly created by the boxplot function.
BOXPLOT cc In the <PDDataName> boxplot data, the '$name' named list contains one or more missing values (NA).
The '$name' list is used to link the boxplot data to the boundary data. If the data cannot be match with a area, the areas glyph row will not be drawn. Check the boxplot data and ensure each individual boxplot has a valid sub-area name associated with it.
BOXPLOT cc There are duplicate sets of boxplot data in <boxnam> for the same area. Only the first one will be used.
The '$name' list in the boxplot data contain duplicates sub-area names. Only one boxplot can be drawn per area. The first set of data will be used to draw the boxplot. Any other data with the same name will be ignored. If a area does not have any boxplot data, that area's glyph will be omitted.
BOXPLOT cc The $stats matrix in the <boxnam> boxplot data does not have 5 values per area list is not the same length as the $name list.
Need description
BOXPLOT cc The $stats matrix in the <pcDataName> boxplot data must <bpNumNames> elements.
*** More Information to be provided later. ***
BOXPLOT cc The $stats matrix in the <pcDataName> boxplot data has missing values. Areas with missing values will not be drawn.
*** More Information to be provided later. ***
BOXPLOT cc The sub-area names/abbrevations in the <pdDataName> boxplot data $names values do not match the border group names: <nameList>.
*** More Information to be provided later. ***
BOXPLOT cc 02BE BOXPLOT cc There are one or more of rows in the <statsDFrame> that does not have matching boxplot data, (<pdDataName>) entries.
*** More Information to be provided later. ***
BOXPLOT cc 02BE BOXPLOT cc The missing sub-areas are: <AreaList>
*** More Information to be provided later. ***
For the time series (TS) and time series with confidence band (TSCONF), data for the graphics is provided through a separate data structure via the panelData column in the panelDesc data.frame. The statsDFrame data.frame is not used and the colx indexes are not used.
<glyphs> cc The panelData value of <pDataValue> in the <panelData> data frame does not exist or is not accessible.
The panelData variable in the <panelDesc> data frame is used to pass completed data structures to glyphs that cannot be contained in the <statsDFrame> data frame. In this case, the variable name provided <pDataValue> does not exist or is not accessible by the package. Make sure the name is spelled correctly and exists in the .GlobalEnv (calling) environment. This message is used by the BOXPLOT, TS and TSCONF glyphs.
The following warnings are issued during the processing of the sub-area identifiers in the <statsDFrame> data frame and the border group name table.
<glyph> cc There are no links in the <PDDataName> data to the following statsDFrame areas: <list of statsDFrame areas.
Sub-areas in the statsDFrame do not have data in <PDDataName>. *** More Information to be provided later. ***
<glyph> cc The following area links in <PDDataName> data do not link to <statsDFrame> areas and will not be used: <listOfNames>
Areas in the <PDDataName> do not have links to the <statsDFrame> data.frame. *** More Information to be provided later. ***
<glyph> cc There are no sub-area links provided in <PDDataName>. Cannot link data to the statsDFrame data.frame.
<PDDataName> does not have values to link to <statsDFrame> or the name table. *** More Information to be provided later. ***
<glyph> cc There are duplicate sub-areas in the <PDDataName> structure. Only the first match will be used.
<PDDataName> has duplicate entries. *** More Information to be provided later. ***
Only validate panelData entries for boxplot, ts and tsconf glyphs. For boxplot, datatype must be list. For ts and tsconf, datatype must be 3 dimensional array.
Implementation: statsDFrame and the sorted link table drives the glyph generation. Must do this to match up with the other columns. If no data is in <PDDataName> for the statsDFrame link, no glyphs is drawn for the row.
If duplicates exist in <PDDataName> only the first will be used.
If extra entries exist in <PDDataName> that don't match the statsDFrame, even if they match the name table, they are not used.
TS and TSConf specific processing:
TSxxxx-15 cc The variable name specified in the panelData row does not exist.
The name of the array containing the area information, the series and the point information (X, Y, HighY and LowY) values does not exist. The glyph will not be drawn. Check for a misspelled variable or incorrect name and retry.
TSxxxx-02 cc The data structured passed in the panelData field is not an array. Structure name = <dataNam>.
The data structure provided the additional information to construct the time series graphs MUST be a 3 dimension array. Any other type of structure is not acceptable.
TSxxxx-03 cc The time serial array's 1st dimension is not <numRows> areas to match statsDFrrame. It is <dimDArr[1]>.
The time series array does have enough 1st dimensional values to match the number of rows in the statsDFrame and the Name Table.
TSxxxx-04 cc The time serial array's 2nd dimension must have at least 2 values (time periods 2 to "n"). It is <dimDArr[2]>.
The time series must cover at least 2 or more time periods. So, the 2nd dimension must have a dimension of at least 2.
TSxxxx-05 cc The time series array's 3rd dimension is not 4 values for TSConf (X, Y, LowY and HighY). It is <dimDArr[3]>.
See note on message "02TA" below.
TSxxxx-06 cc The time series array's 3rd dimension must be 2 or 4. It is <dimDArr[3]>. (TS: X & Y, TSCONF: X, Y, LowY, HighY.)
See notes on message "02TA" below.
TSxxxx-10 cc rowNames on array do not match area ID list. The bad area IDs are: <List of Name Table Keys>.
In processing the time series array, several location IDs cannot be found in the Name Table. They are listed in the <List of Name Table Keys> string. Determine the reason for the incorrect values, correct and retry.
TSxxxx-12 cc The time series array\'s 3rd dimension must be at least 2. It is <number>.
The 1st dimension represents the geographical areas of the Time Service. The 2nd dimension represents the time of the observation (0 to "N"). The area's points are sorted by the X value provided in the 3rd dimension. The 3rd dimension of the array are is the observed value on the graph for the area at the time. For standard TS graphs this is x and y values on the chart. For a TS-conf graph, the 3rd dimension contains the x and y values and the confidence interval HighY and Low Y values. Correct the content of the 3rd dimension and retry.
TSxxxx-14 cc The time series array does not have rownames (location IDs) assigned to the 1st dimension. Data cannout be paired up with areas.
The row.names tied to the 1st dimension of the Time Series array are required to be able to link each time series with it geographic area. Make sure the values used for the 1st Dimension match the select type of location IDs in the Name Table.
panel layout calculations:
column sizing calculations:
PANEL Calculated column widths is less than minimum <colSizeMin> inches - too many columns specified.
*** More Information to be provided later. ***
PANEL Column width is to small to be useful, Package stopped.
*** More Information to be provided later. ***
row sizing calculations:
PANEL panelLayout - The calculated width of <calculated-width> is too large for the available space of <space-width>.
*** More Information to be provided later. ***
PANEL panelLayout - The calculated GrpRow Height is too small to be used.
*** More Information to be provided later. ***
PANEL panelLayout - The calculated GrpRow Height is <GrpRowHeight> inches. The minimum size limit is: <minimum row height>.
*** More Information to be provided later. ***
panel functions:
PANEL panelLengthen - invalid vector length. < 2
*** More Information to be provided later. ***
PANEL panelSelect - Dimension error. Program error - index 'i' or 'j' is out of bounds.
*** More Information to be provided later. ***
PANEL panelSelect - Bad label region name. Must be left, right, top or bottom.
*** More Information to be provided later. ***
Internal Package Messages:
DMP Error in axisMethod selection in Dot and DotSignif code.
*** More Information to be provided later. ***
INB is.between.r The r range value is not a vector with length of 2. FALSE returned.
Function is.between.r — Need description **checked** *** More Information to be provided later. ***
Informational:
Panel Messages:
PANEL Number of parameters overlaid = <numOverlaid.
*** More Information to be provided later. ***
Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
The micromapGDefaults data.frame provides all of the detailed structure, colors, sizing, font sizes, separation distances, line weights and types, spacing, etc. required to physically construct the requested micromapST graphic in portrait or landscape modes from a letter size (8.5 x 11) up to a tabloid (11 x 17) page. The data.frame is mainly used internal to micromapST, but a copy can be obtained by a user when a large number of changes are required. This is not recommended. The primary purpose of this section is to provide a list and description of many of the variables in the details list that can be used to enable or disable functions of micromapST and its glyphs. These internal variable are identified by a "*" after the variable name. These are the only variables that generally safe to be modified by the user.
The data.frame contains two lists: colors and details.
The colors vector is the name of a color palette or a vector of 12 or 24 color names or values ("#xxxxxx" or name). The first twelve (12) colors are used to link the areas to the glyphs. The second 12 colors are used with the Time Series glyphs when transparent colors are required for the confidence band. The vector defines the 12 colors and their transparent equal are:
The 6 colors in each group for the states/areas and symbols in the glyphcs. One color per row (area). The 6th color is not used at this time.
1 color for the median state and glyphics and is generally black,
1 foreground color for highlighted states in the map. This is used to highlight states already referenced previously or have meaning depend on the type of map requested. The usage is as follows:
"map" - not used. "mapcum" - highlight states previously referenced above (a previous group/row). "maptail" - highlight states previously referenced above the median row and highlight remaining states not featured below the median row. "mapmedian" - highlight all states not featured above the median in maps above the median row and highlight all states not featured below the median in maps below the median row.
2 colors to represent non-featured areas above the median row and below the median row.
1 color to fill in non-referenced areas on the map. These are areas in the border group, but the user has not provided any data row in the statsDFrame supplied in the micromapST function call.
1 color to fill in non-active area. That is an area that in not referenced in the name table and can't be matched to any user data.
The additional 12 colors are the same colors defined above but modified 20% tranparency to provide a set of "transparent" colors for confidence graphs like the time series. This is done via the adjustcolors(colors,0.2) function. Only the first 6 of the transparency colors are used. The other 6 colors are reserved for future requipments.
If colors parameter can also be set to a single value to enable black and white color schemes. The acceptable values are; "greys" or "grays" or "bw". When specified, the entire plot will be done using the packages standard black/white/gray shades designed to support b&w duplication, color blindness and non-color publication.
Additional color palettes may be supported in future releases.
The package default 24 colors will be used:
5 state colors: "red", "orange", "green", "greenish blue", "lavender", "magenta",
1 median state color: "black",
1 highlighted area: "light yellow",
2 above and below median highlighted areas: light red, light blue,
1 color for non refernce (used) areas in the data,
1 color for non-active areas in the border group,
12 translucent colors using the above colors at 20%.
It is strongly recommended to use the default. When changing the colors list, then the entire list must be specified.
is a list structure that contains the internal variables and values used by micromapST to create the graphics structure layout and guide the operations of the micromapST function. The details internal variables provide a way to tune the look of the created link micromap and its glyphs. These internal variables are divided into two groups: General and Advanced. The general variable don't affect how the panels are constructed, but allow you to change the looks of the graphics: dot, shapes, colors, line weights, etc. The advanced variable affect the structure of the panels and how areas are presented. The following internal variable accessible through the details named list are grouped by general usage and their glyph types.
To change the values of items in the details list, only the variable(s) requiring change need to be specified as a list for the details=list() parameter in the call. In general, the beginning of the variable names indicates the glyph or glyph group the variable is associated, in most cases.
Arrow. -> arrow glyph Bar. -> bar glyph BoxP -> boxplot glyph CBar -> ctrbar glyph CSNBar -> ctrbar, segbar and normbar glyphs. Dot. -> dot, dotconf, and dotse glyphs Dot.conf. -> dotconf glyph Dot.Signf -> dot with significance overlay. Grid -> grid elements of all glyphs Id. -> id glyph Map. -> map glyphs Panel -> general glyph panel Rank -> area ranking glyphs. Ref -> Reference text and line SCD. -> scatter dot glyph SNBar -> segbar and normbar glyphs Title -> page and column labels and titles TS -> ts and tsconf glyph TSconf -> tsconf glyph
For example: to turn off the midpoint dot in the segmented bar glyphics, all that is required is:
details = list(SNBar.Middle.Dot=FALSE)
The following are the internal variables for the XAxis, Grid, Panels, Reference Line, and the Glyphs.
= 0.2 Size used for XAxis space between lines of labels
= 10 in 1000th of an inch. First and Last label indents from edge.
= 3.4 labels per inch in XAxis - initial objective.
= 0.75 (*100 for precentage). Percentage of the width of a space used to determine label overlaps.
= 0.2 inches. Extra column gap size required to support drawing an Y-Axis for TS and ScatDot glyphics
= "white" Grid line color
= 1, Grid line width
= "#676767FF", defaults to light gray
= "black", color of panel outlines
The following variable related to the reference text and line feature.
= "dashed", set reference value line to dashed
= 1.5, line width of reference line
= "midgreen", color of reference line when color is used.
= "black", color of reference line when grays are used.
= "black", color of reference line text when color is used.
= "black", color of reference line text when grays are used.
The following variables are used by the arrow glyph.
= 2.5, line width of arrow.
= 0.08, size of arrow ( not implemented )
= 0.08, length of arrow.
= 21, arrow-dot symbol 19-25.
= 0.9 cex, arrow-dot size.
= 0.5, line weight used on filled arrow-dot symbols.
= FALSE, include dot outline when filled. FALSE=NO, TRUE=YES
= "black", color used for arrow-dot outline.
= 0.5, line weight for arrow-dot outline.
The following variables are used by the bar glyph.
= 2/3, fraction of line height for bar. Should never be > .90. Usable range is 0.333 to 0.90
= "black", color of bar outline.
= 0.5, line width for bar outline
= "solid", line type for bar outline
The following variables are used by the boxplot glyph.
= 0.2, line width of box.
= 0.6, thick line width.
= FALSE, whether to outline the outlier points.
= "black", color of median box
= 0.80, line width of median line.
= 2, line width for median.
= 0.4, line width of outlier outlines.
= 0.7, size of outlier dots.
= "#4c4c4cFF" color of outliner lines when greys used.
The following variables are used by the dot, dotconf, dotsignif, and dotse glyphs.
= 21, solid circle (S compatible).
= 0.9 cex, size of dot.
= 0.5, linen weight for dot outline when 0:18 dot used.
=FALSE, whether to outline the dots.
= "black", color of dot outline.
= 0.5, line width of dot outline.
In addition to the variables listed above, the dotconf glyph also has the following variable.
= 21, solid circle (S compatible).
= 0.9 cex, size of dot.
= 0.5, linen weight for dot outline when 0:18 dot used.
=FALSE, whether to outline the dots.
= "black", color of dot outline.
= 0.5, line width of dot outline.
= 2, line width of confidence interval lines.
In addition to the variable defined above for the dot, dotconf, dotsignif and dotse glyphs, the dotse glyph also has the following variable define.
= 21, solid circle (S compatible).
= 0.9 cex, size of dot.
= 0.5, linen weight for dot outline when 0:18 dot used.
=FALSE, whether to outline the dots.
= "black", color of dot outline.
= 0.5, line width of dot outline.
= 95, percent confidence interval
= 2, line width of confidence interval lines.
In addition to the variable defined above for the dot, dotconf, dotsignif and dotse glyphs, the dotsignif glyph also has the following variable define.
= 4, overprint character "x" when not significance.
= 0.9*1.2 cex size of the overprint character.
= 0.5, linen weight for dot outline when 0:18 dot used.
= "black", color of overlaid symbol on DOT.
=FALSE, whether to outline the dots.
= "black", color of dot outline.
= 0.5, line width of dot outline.
= 0.05, p-value for testing significance.
= c(0,1), valid range for significant test data for p-value
The following variables are used by the id glyph.
= 1, fudge adjustment for ID Text.
/cr
= 0.9 inches, top panel 1st line id title placement above the first panel, used with lab1
= 0.1 inches, top panel 2nd line id title placement above the first panel, used with lab2
= 0.65, text side of ID column
= 22, pch symbol value to plot next to state name/abbrev.
= 1.5, size of dot symbol for state ID
= 0.8, size of solid dot symbol for state ID
= 0.1 inches. With of the ID symbol (box)
= 0.03125, width of a space in inches.
= 0.055, offset from left for start of ID column.
The following variables are used by all of the "map" type glyphs.
= 0.32, font size for state labels
= "#262626FF", (grey(0.88)) color of background (not active) sub-areas fill in maps
= "white", background of maps
= 0.3, line weight for map background boundaries
= "black", foreground color of maps
= 0.3, line weight for map foreground boundaries
= "lighter grey", color of Layer 2 outline in maps
= 0.35. line weight for Layer 2 boundaries
= "black", color of Layer 3 (national) outline in maps
= 0.4. line weight for Layer 3 (national) boundaries
= 0.09, width in inches of box symbols using in titles for maps
= 2.5 inches, maximum width of each map
= 1.5 inches, minimum width of each map
= "Median for Sorted Panel", text used in single row median panel instead of a map.
The following variable is used by the rank glyph.
= 0.25 inches - column fixed width
= 1, rank method - to be defined.
The following variables are used by the scatter dot glyph (scatdot).
= 0.52, font size for Y axis labels for scatter dots
= 21, type of point/symbol to be used for background data points (not active) - state's dots.
= "transparent", fill color for not selected state's dots.
= "black", border color used for non-active data points.
= 0.6, line width of outline of point/symbol used as non-active data points.
= 0.75, size of point/symbol used as non-active data points.
= 21, type of point/symbol for active data points.
= "black", border color for foreground dots.
= 0.6, Scatter dot symbol outline line weight for active data points.
= 1, size of point/symbol for active data points in scatter dots.
= 21, shape of filled symbol for median value - scatter dots.
= "black", border color for median symbol - scatter dots.
= 0.6, line width used on the median symbol - scatter dots.
= 1, symbol size median value - scatter dots.
= "black", color of filled symbol for median value - scatter dots.
= FALSE, whether or not to include horizontal grid lines in panel.
= 1.08, x range multiplier to keep dots from being clipped
= 1.12, y range multiplier to keep dots from being clipped
= TRUE, whether or not to include x=y sloped line
= colGrid, color of sloped line, default, grid line color. SCD.DiagLine must be TRUE.
= 1.25, line weight of sloped line, default, grid line color. SCD.DiagLine must be TRUE.
= "solid", line type of sloped line, default, grid line color. SCD.DiagLine must be TRUE.
micromapST support three types of stacked bar graphs: Centered, Segmented, and Normalized. The following section describes the internal variable used by all of these glyphs and then the unique variables used by each type.
The following variables are used by all of the horizontal stacked bar glyphs (ctrbar, segbar and normbar). Any variable for a specific type of stacked bar are listed following this section. The CSN at the beginning of the name of each variable indicates they are part of this group.
= 0.66667, fixed height of bar when variable height bars are is not used. Should never be greater than 0.90. Usable range is 0.333 to 0.90.
= 0.3333, height of first bar when variable height bars are used. Must be less than SBar.Last.barht and in the range of 0.333 to 0.6667, SNBar.varht or CBar.varht must be TRUE for this option of function.
= 0.80, height of last bar when variable height bars are used. Must be greater than CSNBar.First.barht and in the range of 0.6667 to 0.90. CSNBar.varht or CBar.varht must be TRUE for this option of function.
= "black", color of stacked bar outlines.
= 0.75, line weight for bar segment outline in segmented bar plots.
= "solid", line type for bar segment outline in segmented bar plots.
The following variables are only used by the ctrbar glyph.
= FALSE, enables variable height bars.
= FALSE, request two ended variable height bars be used
= FALSE, request a line is drawn at center point.
= 0, value of the center stacked bar glyph. def=0
= "white", centered bar zero vertical line color
= 1, line width for centered bar zero vertical line
= "dotted", type of centered bar zero line.
The following variables are used by the segbar and normbar glyphs:
= FALSE, enables variable height bars from
SBar.First.barht to
SBar.Last.barht.
= FALSE, request two ended variable height bars be used (small to large to small).( Not implemented )
=FALSE, request a dot be draw in at the mid point in the segmented bars.
=21, type of point/symbol used as the mid point dot. SNBar.Middle.Dot must be TRUE for this parameter to function.
="white", color of the point/symbol used as the mid point dot. SNBar.Middle.Dot must be TRUE for this parameter to function.
=0.3, size of point/symbol used
as the mid point dot.
SNBar.Middle.Dot must be TRUE for this
parameter to function.
=NA, line width of outline of
point/symbol used as the mid point dot.
SNBar.Middle.Dot must be TRUE for this parameter to function.
=NA, color of outline of point/symbol used as the mid point dot. SNBar.Middle.Dot must be TRUE for this parameter to function.
The following variables are used by the time series glyphs (ts and tsconf):
= 1.1, time series line weight
= 0.52, font size for Y axis labels
= FALSE, whether or not to include horizontal grid lines in panel
The advanced internal variable should not be modified unless you have
lot of time. If any are modify, the operation of the package may become
unpredicatable and can't be supported. However, through the use of
gentle changes and experimentation, you can modify the look of
the resulting micromapST output. While not recommended, the authors
felt access to these internal variable will help a user in some
unexpected situations.
These variables should not be modified unless absolutely necessary.
IF there area modified, the outcome can not be predicted and can't
be supported. Modify at your own risk:
= 1.2, top margin in inches
= 0.5, bottom margin in inches
= 0.75, bottom margin for legend
= 0.2, bottom margin difference
= 0.0, left margin in inches
= 0.2, left margin when Y axis labels and title are required
= 0.0, right margin in inches
= 0.5 inches. The border space between the page edges and the margins. This value is used for the top, bottom, left and right border spacing.
= c(0.75, 0.1, 0), Left Y axis margin line for axis labels and axis line.
= 0.67, y axis padding for integer plotting locations
= 0.34 inches, total panel padding (i.e., 0.17 at top and bottom of panel)
= 0.63 inches, spacing to keep reference line off panel edge
= 0.075 inches. Space between panel groups at or around the median panel. There are 7 units per panel. The average unit is between 1/10 and 1/8 inches.
= 0.5 inches. Minimum panel group height.
= 1.25 inches. Maximum panel group height.
= 1.65 units. Minor panel group height in units for single area panel.
= 7 units. Major panel group height for all panels, except single area panel.
= 0.75 inches. Glyphics column separator space.
= 0.75 inches. Minimum Glyph column size.
= 2.0 inches. Maximum Glyph column size.
= 1.27 inches, top panel 1st line placement above the first panel, used with lab1
= 0.64 inches, top panel 2nd line placement above the first panel, used with lab2
= 0.01 inches, top panel X-Axis line placement above the first panel
= 0.01 inches, bottom panel X-Axis line placement below the last panel
= 0.64 inches, bottom panel line placement below the last panel, used with lab3
= 1.27 inches, reference line legend below last panel, used with reftext
= 0.35 inches, Y axis label placement (to the left of panel), used with lab4
= 1.0, text size of title, used with title
The following variable is reserved for package testing only and should not be used.
= 0, disabled. Do not use.
The following variables are not implemented and reserved for future use.
( Not implemented )
= 0.25, minimum row size to col size ratio permited. (not implemented)
= 2, maximum row size to col size ratio permited. (not implemented)
= c(3.2,0.1,0), Top margin line for X axis labels and axis line.
= c(3.2,0.1,0), Bottom margin line for X axis labels and axis line.
= -0.35, Axis tick label placement adjustment
= 1.08, x axis scale expansion factor. Applied to the data range to calculate the graph\'s range.
= TRUE, enable staggered label feature - NOT USED
= 0.888889 - actually 0.6667 Size used for large XAxis labels
= 0.777778 - actually 0.5833 Size used for medium XAxis labels
= 0.666667 - actually 0.5 Size used for small X Axis labels For line labels - Normal = .75, space = .15 (20%), small space = .15 .5 = .075
= 0.0 inches. X Axis offset
= 0.33332 cex size of Y Axis labels
= 0.0 lines. Offset of Y Axis labels from panel edge.
= 5 labels. Number of labels per inche for Y Axis - initial goal.
= TRUE. Enable staggered labels on Y Axis - NOT USED.
/cr
= 0.75, size of reference line text
= 4.0, line width of arrow shadow to create outline ( not implemented )
= "black", arrow shadow color ( not implemented )
= 20, symbol for outlier - 19-25.
= "#262626FF", boxplot outline color
= 19, solid circle symbol.
= "white", color of median dot.
= 0.95, size of circles.
= 0.55, size of confidence interval
= 0.55, size of confidence interval
= "dark gray", color of outlines of ID symbols
= 0.8, line weight for outlines of ID symbols
= "solid", line type for map background boundaries
= "solid", line type for map foreground boundaries
= "solid", line type for Layer 2 boundaries
= "solid", line type for Layer 3 (national) boundaries
= "white", background color of map panels
= "lightest grey", color of sub-areas not referenced in maps
= FALSE. Position of last staggered label. FALSE = low, TRUE = high.
= 0.75 cex, general text size
The micromapGDefaults data.frame is built by the micromapGSetDefaults function when the micromapST package is called. Once built it cannot be changed. To change one or two (a few) variables, construct a list of these variables and pass it to micromapSEER via the details parameter in the call. To do large scale customization, call the micromapGSetDefaults function to get a copy of the entire data.frame and modify this copy. This is not recommended.
Daniel B. Carr, George Mason University, Fairfax VA, with contributions from Jim Pearson and Linda Pickle of StatNet Consulting, LLC, Gaithersburg, MD
micromapGSetDefaults, micromapGSetPanelDef
The micromapGSetDefaults function generates a data.frame containing two lists: the colors and details list. Each list contains the operational parameters to instruct micromapST on how to physically construct a micromapST graphic.
The colors list contains 24 rgb colors:
The 12 basic colors are:
for the state maps in each group of 6 areas that make up one panel row. Currently the max is 5 areas per group. The 6th color is reserved for future use.
for the median state (in the middle)
for the cumulative hightlight color for the areas already presented in other panel rows.
for the when highlighting areas above and below the median.
for use to shade not referenced areas in the user data.
for use to shade not-active areas.
and a repeat of the 12 rgb colors with an alpha transparency value of 10%.
The details list contains the spacing, margins, text size, etc. information to guide the construction of the micromapST graphic. These lists may be modified by the user, but this is not recommended. See the micromapSTDefault description for more information on these lists.
micromapGSetDefaults()
micromapGSetDefaults()
The default colors in the colors list are: red, orange, green, greenish-blue, lavender and magenta for the area colors, black for the median state color, and light yellow for the cummulative hightlighting, vary pale red and green for highlight areas above and below the median, the lightest and lighter gray to shade not referenced and not-active areas on the map. The default values in the details list can be found in the micromapGDefaults documentation.
This function is primarily used internally by micromapST. However, if the user wants to see values of all of the internal variables or wants to make wholesale changes to the layout and operational changes to the colors and details list, this function can be used to create a copy of the full colors and details lists.
The return value is a variable equal to list(colors, details) where both colors and details are named lists.
Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
The micromapGSetPanelDef function generates a data.frame representing the layout of the columns and row groups in the total graphic. It include the location of each individual graphic, the spacing of the columns and row groups and spacing for the title column headers and trailers and x axis when needed. The resulting structure along with the values from micromapGSetDefaults are copies into the package's memory and used by the "panel functions" to manage the panels during the setup and glyphics creation.
micromapGSetPanelDef(nRows,rSizeMaj,rSizeMin,rSepGap,MaxRows,UGrpPattern)
micromapGSetPanelDef(nRows,rSizeMaj,rSizeMin,rSepGap,MaxRows,UGrpPattern)
nRows |
for the number of data rows to include in the map.) |
rSizeMaj |
the maximum size of a row group in inches. A row group represents a number of data rows as one horizontal set of graphs. |
rSizeMin |
the minimum size of a row group in inches. |
rSepGap |
the size of the spacing between row groups in the output graphic. |
MaxRows |
the maximum number of rows. |
UGrpPattern |
a vector of the number of rows per group. This is specified by the user as a substitute for the calculated row group pattern. The number of rows must equal the number of rows in the data that match the mapping data and must have the rows grouped so that the top and bottom groups have the most rows. The pattern can then slowly dimenish to 2 rows as the group approach the middle (median) row group. |
Minor modifications can be made to the presentation structure through the details= call parameter, but this is not recommended. See the micromapGDefault description for more information on these lists.
Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
The micromapSEER function or the micromapST function with the bordGrp set to USSeerBG can be used to create linked micromaps for the 20 U. S. Seer Registries.
micromapSEER(statsDFrame,panelDesc,...)
micromapSEER(statsDFrame,panelDesc,...)
statsDFrame |
- data frame of the data required for creating the graphical glyphics with one row per Seer Area being mapped. Each row is linked to the boundary data using a Seer Area abbreviation or name. |
panelDesc |
- data frame containing information and pointers for each glyphic column to be generated. The data frame specifies the type of glyphic and the columns in the statsDFrame data frame that contain the data for the glyphic or the external data structure via the panelData list. |
... |
- the remaining parameters required and used by the micromapST function call. |
A release of micromapST was custom modified for the NCI Seer projects. Later the cusstom changes were integrated into the standard micromapST and the micromapSEER only version was discontinued. This micromapSEER shell call allows existing code to continue to work without modification.
None
Daniel B. Carr, George Mason University, Fairfax VA, and Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
Provides a easy and quick means of creating Linked Micromaps for any collection of geographically associated areas. The micromapST package uses the standard graphics and RColorBrewer packages to rapidly create highly readable linked micromap plots. This gives the user the ability to explore different views of their data quickly.
micromapST uses the border and name information contained in border group datasets to define the geographical areas used in creating the linked micromaps.
The micromapSEER function is included to help users of the specialized micromapST for NCI Seer areas called micromapSEER migrate to this package. The micromapSEER function, calls micromapST with the border group set to USSeerBG to generate linked micromaps for the 20 U. S. Seer Areas.
The micromapST contains border group datasets with the boundary and name information for the following::
USStatesBG - data from the original micromapST package for . the U. S. 50 states and the District of Columbia.
USSeerBG - data for the 21 U. S. Seer Areas.
KansasBG - data for the 105 counties in the state of Kansas.
NewYorkBG - data for the 62 counties in the state of New York.
MarylandBG - data for the 24 counties in the state of Maryland.
UtahBG = data for the 29 counties in the state of Utah
ChinaBG - 34 administrative areas in the country of China.
UKIrelandBG - 218 administrative areas in UK, Ireland and Isle of Man.
SeoulKoreaBG - 25 districts in Seoul S. Korea.
AfricaBG - 52 countries of the Africa continent.
Each plot row represents a single sub-area (state, county or province) within
the border group area. Each column can be defined to present a different
graphical representations of the user's data.
For linked micromaps, the primary columns are a MAP type, ID (sub-area name)
and one or more glyphics. The statistical data is presented in the
glyphics columns as one of the following glyph types:
arrows,
bars,
boxplots,
dots,
dots with confidence intervals,
dots with standard error,
dot with a significance marker,
time series line plots with or without confidence bands,
scatter plots, and
horizontal stacked (segmented, centered, and normalized) bars.
All border groups are distrbuted with the package as .rda datasets.
micromapST ( statsDFrame, panelDesc, rowNamesCol = NULL, rowNames = NULL, sortVar = NULL, ascend = TRUE, title = c("", ""), plotNames = NULL, axisScale = NULL, staggerLab = NULL, bordGrp = NULL, bordDir = NULL, dataRegionsOnly = NULL, regionsB = NULL, grpPattern = NULL, maxAreasPerGrp = NULL, ignoreNoMatches = FALSE, colors = NULL, details = NULL)
micromapST ( statsDFrame, panelDesc, rowNamesCol = NULL, rowNames = NULL, sortVar = NULL, ascend = TRUE, title = c("", ""), plotNames = NULL, axisScale = NULL, staggerLab = NULL, bordGrp = NULL, bordDir = NULL, dataRegionsOnly = NULL, regionsB = NULL, grpPattern = NULL, maxAreasPerGrp = NULL, ignoreNoMatches = FALSE, colors = NULL, details = NULL)
statsDFrame |
a data.frame containing data used with the following plots/glyphs:.
arrow, bar, segbar, normbar, ctrbar,
dot, dotse, dotconf, dotsignif,
and scatdot plots.
The data for the boxplot and time series plots (TS and TSConf)
is more complex and multi-dimensional and is passed to the glyph generation routines
via the panelData parameter (see below for more details.) |
panelDesc |
a data.frame that defines the description of each column: types, associated data columns in the stateFrame data.frame, column titles (top and bottom), reference values and text, and names of additional data.frames for complex glyphics (time series and boxplots). See section on panelDesc data.frame. for more details. |
rowNames |
defines the type of value used as the row.names in the stateFrame. The options are:
See the documentation on each border group for details on
the full names, abbreviatations, optional alternate abbreviations
and numeric identifiers defined for each area in the particular
border group in the areaNamesAbbrsIDs R object. If a border group does not have one of these sets of name data, then the corresponding option will not be available. The linkage between data and boundaries is accomplished using the strings in the
statsDFrame column identified by rowNamesCol or if no rowNamesCol is specified in
the The linkage values are validated to against the areaNamesAbbrsIDs data.frame for the specified border group for the micromapST call. If a sub-area in the border group is not referenced in the data, it is outlined on the map, but not colored. If the sub-area link cannot be matched, a warning message is generated and the execution of the package is stopped. Unless, the user has set the noMatchIgnore call argument to TRUE and the data will be ignored. The alias option is implemented to support with the USSeerBG border group and data generated by the Seer Stat program. The registry column identifying the Seer Area contains the Seer Area name and additional information. This option along with the alias field in the border group allows the package to use the registry column as the linkage to the boundary data using a wildcard or partial string allows the registry column generated by Seer Stat to be used as the linkage. This option is controlled the enableAlias variable in the border group's areaParms data.frame. |
rowNamesCol |
allows the user to specify the data.frame column that contains the sub-area string to be used to link the data row to the boundary data for the sub-area. The rowNames option above specifies which name information (full name, abbreviation, id, alternate abbreviations) the sub-area string is matched to in the border group name information. (see the border group documentation for more details.) The rowNamesCol value must be a column number or column name within the statsDFrame data.frame. The default value for rowNamesCol is to use row.names of the passed data.frame. |
sortVar |
defines the column name(s) or number(s) in the statsDFrame data.frame to be used to sort the statsDFrame data.frames before creating the state micromap. A vector of column names or numbers can be used sort on multiple columns and to break ties. For Example: |
ascend |
a logical value. If TRUE, sortVar will be sorted in ascending order. If FALSE, sortVar will be sorted in descending order. The default value is TRUE. |
title |
A character vector with one or two character strings to be used as the title of the overall micromap plot page. For example: \code{title = "micromapST Title"} or \code{title = c("title line 1","title line 2")} |
plotNames |
defines the type of state names to be displayed when an id glyph column
requested. The options are: ab or full.
ab will display the area abbreviations in uppercase as defined in the border
groups name information. |
axisScale |
defines the type of axis labels to be used and if scaling is applied. The acceptable values are o, w, s, and sn. o is the original method using the pretty function limiting the values to the range of the data. w is the default and uses the Wilkinson algorithm to generate the axis labels for a set of data, s uses the wilkinson algorithm, adjusts the range to cover the labels, and determines a scaling to apply to the labels. If 1000 is the scaling, "in the thousands" is added to the column titles. sn uses the wilkinson algorithm, adjusts the range to cover the labels, and checks each value to determine if it should be scaled. If it is scales, the scale identifier is added as a suffix. (e.g., 1240334 become 1.24M) "w" is the wilkinson algorithm without any scaling. |
staggerLab |
is a logical variable the controls if the axis labels are staggered and drawn on on two lines instead of one. If set to FALSE (the default), the labels are not staggered. If TRUE, two lines are used to draw the axis labels, alternating labels on each line. |
bordDir |
specifies the path to a border group dataset (.rda) that is external to the micromapST package. The name of the border group is specified in the bordGrp parameter. |
bordGrp |
specifies which preloaded border group to use for the area names, abbreviations, numeric identifier, and area boundary data. The supported border groups are USStatesBG and USSeerBG. The default bordGrp value is USStatesBG. For more information on building your own border group refer to the section in this manual on the BuildBorderGroup function. |
dataRegionsOnly |
specifies to only map regions containing data when aP_Regions is set to TRUE. This indicates the name table contains the information to draw partial area maps of regions. When set to TRUE, only regions within the area are drawn that contain at least one sub-area with data provided by the caller. If FALSE, the entire area is drawn. The default is FALSE. Retional boundaries are not required, but suggested. See border group documentation for more details. |
regionsB |
when regional boundaries are provided, controls whether to overlay the boundaries on the map. If dataRegionsOnly is set to TRUE and only a subset of regions will be mapped and regional boundaries are provided, regionsB is et to TRUE. The default is FALSE. Independent of dataRegionsOnly, if set to TRUE, if present, regional boundaries are overlayed on the map, If set to FALSE, no regional boundaries are drawn. See border group documentation for more details. |
grpPattern |
The micromapST package generates the pattern of rows to panel groups automatically. The pattern is based on a maximum of 5 rows per group and number of rows can only decend from the edges toward the median row. The pattern generated by the package can be overriden by using the grpPattern to specify the pattern to use as a numeric vector. The provided vector must pass the following checks: a) must be a numeric vector, b) the sum of the values in the vector must equal the number of rows in the user supplied statsDFrame data frame c) the maximum number of rows per panel group is 5. d) the number of rows per panel must be integers and decending from the outside to the middle of the vector. |
maxAreasPerGrp |
This parameter defines the maximum number of areas to be represented by a group-row in the resulting linked micromap. The value can be 5 or 6. |
ignoreNoMatches |
The micromapST package will automatically handle situations where there is no data for a sub-area in an area. However, it will stop processing if there is data for a non-existing area. To instruct the package to ignore rows of data for sub-areas that do not have boundary information in the border group, set the call argument ignoreNoMatches to TRUE to get the package to detail the data rows from the processing. |
colors |
is a vector containing a vector of 12 or 24 color names or values ("#xxxxxx") or the name of a color palette. The vector of 12 or 24 color names or "#xxxxxx" values are used to define the colors used for:
The only color palette support is a gray palette to permit publication of the linked micromaps using a gray scale instead of color. By setting colors = "greys", "grays", or "bw", the entire plot will be generated using gray scale that has been balanced to maintain readability and reproduction without the use of color printing. Additional color palettes may be supported in future releases. If a colors vector is not provided, the package default colors will be used:
It is strongly recommended to use the default. |
details |
defines the spacing, line widths and many other details of the plot layout, structure and content; see micromapGDefaults$details for more details. Generally details does not need to be specified, the default values will always be used and are strongly recommended. However, in a few cases, it may be desireable to turn off or disable a feature. In these cases, the user can specify just the specific variable and value in a list and pass it to micromapST via the details parameter. For example: details=list(SNBar.Middle.Dot=FALSE,SBar.varht=FALSE) The entire details variable list does not have to be passed. See the section on the micromapGDefaults$details for more details. |
The micromapST function creates a linked micromap plot for data referencing a collection of geographic related areas, like the 50 US States and DC geographical areas or U.S. Seer Areas. The function provides links from a US state map to several forms of graphical charts: dot (dot), dot with confidence intervals (dotconf), dot standard error (dotse), dot with significance mark (dotsignif), arrow (arrow), bar chart (bar), time series (ts), time series with a confidence band (tsconf), horizontal stacked (segmented) bar (segbar), normalized bar (normbar), centered bar charts (ctrbar), scattered dot (scatdot), and box plots (boxplot). The data values for each column of graphs and each area are provided in the statsDFrame data.frame. The panelDesc data.frame specifies the type of chart, the column numbers in the statsDFrame with the statistics for the chart, column titles, reference values, etc. Additional data for boxplots and time series plots are provided through the panelData data.frame column.
The following sets of data have been included in the package to support the examples and provide samples of data the micromapST package can utilize.
Dataset contained in the package to support the examples provides are:
U. S. State population data for 2010.
To be Added
To be Added
To be Added
White Female Lung (wflung) data from 1995 and 2000
Wflung US data for 2000 by county.
Wflung US data for 2000 and 1995.
Wflung data for 50s to 70s
Wflung data for 50s to 70s US rates
Kansas state population by county.
Maryland state population by county.
New York state population by county.
Utah state population by county.
Africa population by country.
Seoul City population by district.
China population by province and adminstrative district.
None
Daniel B. Carr, George Mason University, Fairfax VA, with contributions from Jim Pearson and Linda Pickle of StatNet Consulting, LLC, Gaithersburg, MD
Daniel B. Carr and Linda Williams Pickle, Visualizing Data Patterns with Micromaps, CRC
Press, 2010
Linda Williams Pickle, James B. Pearson Jr., Daniel B. Carr (2015), micromapST: Exploring and Communicating Geospatial Patterns in US State Data.,
Journal of Statistical Software, 63(3), 1-25., https://www.jstatsoft.org/v63/i03/
### # # micromapST - Example # 01 - map with no cumulative shading, # 2 columns of statistics: dot with 95% confidence interval, # boxplot sorted in descending order by state rates, using # the default border group of "USStatesBG", with default symbols. ### # load sample data, compute boxplot TDir<-"c:/projects/statnet/" # my private test PDF directory exist, don't use temp. if (!dir.exists(TDir)) {TDir <- paste0(tempdir(),"/") } # get a temp directory for the output # PDF files for the example. cat("TempDir:",TDir,"\n") # replace this directory name with the location if you want to same # the output from the examples. utils::data(wflung00and95,wflung00and95US,wflung00cnty,envir=environment()) wfboxlist = graphics::boxplot(split(wflung00cnty$rate,wflung00cnty$stabr), plot=FALSE) # set up 4 column page layout panelDesc01 <- data.frame( type=c("map","id","dotconf","boxplot"), lab1=c("","","State Rate","County Rates"), lab2=c("","","and 95% CI","(suppressed if 1-9 deaths)"), lab3=c("","","Deaths per 100,000","Deaths per 100,000"), col1=c(NA,NA,1,NA),col2=c(NA,NA,3,NA),col3=c(NA,NA,4,NA), refVals=c(NA,NA,NA,wflung00and95US[1,1]), refTexts=c(NA,NA,NA,"US Rate 2000-4"), panelData= c("","","","wfboxlist") ) panelDesc <- panelDesc01 # set up PDF output file, call package ExTitle <- c("Ex01-US White Female Lung Cancer Mortality, 2000-2004", "State Rates & County Boxplots") grDevices::pdf(file=paste0(TDir,"Ex01-US-WFLung-2000-2004-St-DotCf-Co-Box.pdf"), width=7.5,height=10) micromapST(wflung00and95, panelDesc01, sortVar=1, ascend=FALSE, title=ExTitle ) x <- grDevices::dev.off() # ### End Example 01 ### # # micromapST - Example # 02 - map with cumulative shading # from top down (mapcum), arrow and bar charts, # sorted in descending order by starting # value of arrows (1950-69 rates) using default # border group of "USStatesDF". This # example also provides custom colors for the # linked micromaps, highlights, etc. # ### # Load example data from package. utils::data(wmlung5070,wmlung5070US,envir=environment()) panelDesc02 <- data.frame( type=c("mapcum","id","arrow","bar"), lab1=c("","","Rates in","Percent Change"), lab2=c("","","1950-69 and 1970-94","1950-69 To 1970-94"), lab3=c("MAPCUM","","Deaths per 100,000","Percent"), col1=c(NA,NA,"RATEWM_50","PERCENT"), col2=c(NA,NA,"RATEWM_70",NA) ) colorsRgb = matrix(c( # the basic 7 colors. 213, 62, 79, #region 1: red #D53E4F - Rust Red 252, 141, 89, #region 2: orange #FC8D59 - Brn/Org 253, 225, 139, #region 3: green #FEE08B - Pale Brn 153, 213, 148, #region 4: greenish blue #99D594 - med Green 50, 136, 189, #region 5: lavendar #3288BD - Blue 255, 0, 255, #region 6 #FF00FF - Magenta .00, .00, .00, #region 7: black for median #000000 - Black 230, 245, 152, #non-highlighted foreground #E6F598 - YellowGreen 255, 174, 185, # alternate shape upper #FFAEB9 - Mauve 191, 239, 255, # alternate shape lower #BFEFFF - Cyan 242, 242, 242, # lightest grey for non-referenced sub-areas #F2F2F2 234, 234, 234), # lighter grey for bkg - non-active sub-areas. #EAEAEA ncol=3,byrow=TRUE) xcolors = c( grDevices::rgb(colorsRgb[,1],colorsRgb[,2],colorsRgb[,3], maxColorValue=255), # set solid colors grDevices::rgb(colorsRgb[,1],colorsRgb[,2],colorsRgb[,3],64, maxColorValue=255)) # set translucent colors for time series. # set up reference names for color set names(xcolors) =c("rustred","orange","lightbrown","mediumgreen", "blue","magenta", "black","yellowgreen", "mauve","cyan","lightest grey","lighter grey", "l_rustred","l_orange","vlightbrown","lightgreen", "lightblue","l_black","l_yelgreen","l_mauve", "l_cyan","l_lightest grey","l_lighter grey") ExTitle <- c("Ex02-US Change in White Male Lung Cancer Mortality Rates", "from 1950-69 to 1970-94-Diff colors") grDevices::pdf(file=paste0(TDir,"Ex02-US WmLung50-70-Arrow-Bar.pdf"),width=7.5,height=10) micromapST(wmlung5070,panelDesc02,sortVar=1,ascend=FALSE, title=ExTitle, colors=xcolors ) x <- grDevices::dev.off() # ### End Example 02 ## Not run: ### # # micromapST - Example # 03 - Time Series Line Plots with # Confidence Bands maptail option highlights states from extremes # to middle state read in time series data set example using the # default border group of "USStatesDF". # ### # Load example data from package. utils::data(TSdata,envir=environment()) temprates <- data.frame(TSdata[,,2]) # TSdata structure is array of size c(51,15,4), # dimensions = 51 states, 15 years, (year label, point value, low limit, # high limit) panelDesc03 <- data.frame( type=c("maptail","id","tsconf","dot"), lab1=c("","","Time Series","Female"), lab2=c("","","Annual Rate per 100,000","Most Recent Rate (2010)"), lab3=c("","","Years","Deaths per 100,000"), lab4=c("","","Rate",""), col1=c(NA,NA,NA,15), panelData =c(NA,NA,"TSdata",NA) ) ExTitle <- c("Ex03-US Time Series with Confidence bands", "Annual Female Lung Cancer Mortality Rates, 1996-2010") grDevices::pdf(file=paste0(TDir,"Ex03-US Time-Series-with-Conf.pdf"), width=7.5,height=10) micromapST(temprates,panelDesc03,sortVar="P15",ascend=FALSE, title=ExTitle) x <- grDevices::dev.off() # ### End Example 03 ## End(Not run) ### # # micromapST - Example # 03a - Time Series Line Plots with # Confidence Bands maptail option highlights states from extremes # to middle state read in time series data set example using the # default border group of "USStatesDF". # # Specify the x-Axis values are dates and to format them as dates. ### # Load example data from package. utils::data(TSdata,envir=environment()) temprates <- data.frame(TSdata[,,2]) # y rate # In the original package TS data, the x data was not # a date value, it was the year number. To be able to demonstrate # the X-Axis Date format labeling, these were changed to Date values # by effectively substracting 1970-1-1 from the year value. #### # # Example 3a - Building TS conf array and converting years # into date values for the X-Axis and labels. # # example of build a TS Conf array. # # Using the old TSdata array as a starting point and source of data, # but build an entirely new TS array structure in a similar manner # that might be used to build your own time series array. # data(TSdata) # get old array TSAreas <- row.names(TSdata) # one per area (index 1) NewArray <- array(dim=c(51,15,4),dimnames=list(TSAreas)) # this is for 51 states, 15 samples/observations, and 4 values per sample. for (inx in seq(1,length(TSAreas))) { # loop once per area Samp <- TSdata[inx,,] # samples for an area # each sample has 15 observations of 4 values. # value 1 is the X axis data or the DATE of the observation Samp[,1] <- as.Date(paste0(as.character(Samp[,1]),"-01-01")) # convert simple year number to date NewArray[inx,,] <- Samp } # setting the attribute "xIsDate" on array to TRUE, signals micromapST # the user wants to see the x-axis values as dates. ## Not run: attr(NewArray,"xIsDate") <- TRUE # TSdata and NewArray structures are arrays of size c(51,15,4), # dimensions = 51 states, 15 years, (year label, point value, low limit, high limit) panelDesc03a <- data.frame( type=c("maptail","id","tsconf","dot"), lab1=c("","","Time Series (YYYY-MM)","Female"), # recommend adding to the column title a note about the date format used. lab2=c("","","Annual Rate per 100,000","Most Recent Rate (2010)"), lab3=c("","","Years","Deaths per 100,000"), lab4=c("","","Rate",""), col1=c(NA,NA,NA,15), panelData =c(NA,NA,"NewArray",NA) ) ExTitle <- c("Ex03a-US Time Series with Confidence bands with time (yyyy-mm)", "Annual Female Lung Cancer Mortality Rates, 1996-2010") grDevices::pdf(file=paste0(TDir,"Ex03a-US Time-Series-with-Conf wDates.pdf"), width=7.5,height=10) micromapST(temprates,panelDesc03a,sortVar="P15",ascend=FALSE, axisScale="s", title=ExTitle) x <- grDevices::dev.off() ## End(Not run) # ### End Example 03a ### # # micromapST - Example 04 - dot followed by a scatter dot columns # use same data as Example 3 to compare 1996 & 2010 rates # mapmedian option shades states above or below the median # (light yellow) using the default border group of "USStatesBG" # # USES data loaded for Example 03 (temprates). # #### # Load example data from package. utils::data(TSdata,envir=environment()) temprates <- data.frame(TSdata[,,2]) # y rate panelDesc04 <- data.frame( type=c("mapmedian","id","dot","scatdot"), lab1=c("","","Female Lung Cancer Mortality","Comparison of Rates"), lab2=c("","","Rate in 1996 (Sort Variable)", "in 1996 (x axis) and 2010 (y axis)"), lab3=c("","","Deaths per 100,000","Deaths per 100,000 in 1996"), lab4=c("","","","Rate in 2010"), col1=c(NA,NA,1,1), col2=c(NA,NA,NA,15) ) ExTitle <- c("Ex04-US Dot Plot for 1996, Scatter Plot Comparing 1996 to 2010", "Female Lung Cancer Mortality Rates") FName <- paste0(TDir,"Ex04-US FLCMR Scatter-Dots-1996-2010.pdf") grDevices::pdf(file=FName,width=7.5,height=10) micromapST(temprates,panelDesc04,sortVar=1,ascend=FALSE,title=ExTitle) x <- grDevices::dev.off() # ### End Example 04 ### # # micromapST - Example 05 - horizontal stacked (segmented) bars # segbar plots the input data, normbar plots percent of total # package computes the percents from input data # input for the categories for each state must be in consecutive # columns of the input data.frame using the default border group # of "USStatesBG" #### # Load example data from package. utils::data(statePop2010,envir=environment()) panelDesc05 <- data.frame( type=c("map","id","segbar","normbar"), lab1=c("","","Stacked Bar","Normalized Stacked Bar"), lab2=c("","","Counts","Percent"), col1=c(NA,NA,"Hisp","Hisp"), col2=c(NA,NA,"OtherWBH","OtherWBH") ) ExTitle <- c("Ex05-Stkd Norm Bars: 2010 Census Pop by Race, Sorted by Cnt Other Race", "Cat-L to R: Hispanic, non-Hisp White, Black, Other-sn-varbar") grDevices::pdf(file=paste0(TDir,"Ex05-US Stkd-Norm Bar-var-height.pdf"), width=7.5,height=10) micromapST(statePop2010, panelDesc05, sortVar="OtherWBH", ascend=FALSE, title= ExTitle, details=list(SNBar.varht=TRUE), axisScale="sn" ) x <- grDevices::dev.off() # ### End Example 05 ## Not run: ### # # micromapST - Example 06 - horizontal stacked (segmented) bars # segbar plots the input data, normbar plots percent of total # package computes the percents from input data # input for the categories for each state must be in consecutive # columns of the input data.frame using the default border group # of "USStatesBG". # # Turning off the variable bar height and the midpoint dot features # in the horizontal stacked bars (segmented) # # USES data loaded for Example 05 above - statePop2010. # ### # Reuse data loaded for Example 5 above. panelDesc06= data.frame( type=c("map","id","segbar","normbar"), lab1=c("","","Stacked Bar","Normalized Stacked Bar"), lab2=c("","","Counts","Percent"), col1=c(NA,NA,"Hisp","Hisp"), col2=c(NA,NA,"OtherWBH","OtherWBH") ) ExTitle <- c("Ex06-Stacked Norm Bars: 2010 Census Pop by Race, Sorted by Other Race", "Cat-L to R: Hisp, non-Hisp White, Black, Other,ID-diamond") grDevices::pdf(file=paste0(TDir,"Ex06-Stkd-Norm-Bar-fixedheight-nodot.pdf"), width=7.5,height=10) micromapST(statePop2010,panelDesc06,sortVar=4,ascend=FALSE, title= ExTitle, details=list(SNBar.Middle.Dot=FALSE,SNBar.varht=FALSE,Id.Dot.pch=23) ) x <- grDevices::dev.off() # ### End Example 06 ## End(Not run) ### # # micromapST - Example 07 - centered (diverging) stacked bars # # National 8th grade Math Proficiency NAEP Test Scores Data for 2011 # source: National Center for Education Statistics, # http://nces.ed.gov/nationsreportcard/naepdata/ # bar segment values - % in each of 4 categories: # % < Basic, % at Basic, % Proficient, % Advanced # using the default border group of "USStatesBG". #### # Load example data from package. utils::data(Educ8thData,envir=environment()) # columns = State abbrev, State name, Avg Score, %s \<basic, # basic, proficient, advanced panelDesc07 <- data.frame( type=c("map","id","dot","ctrbar"), lab1=c("","","Avg. Scores","Math Proficiency"), lab2=c("","","","<Basic, Basic, Proficient, Advanced"), lab3=c("","","","% to Left of 0 | % to Right"), col1=c(NA,NA,"avgscore","PctBelowBasic"), col2=c(NA,NA,NA,"PctAdvanced") ) ExTitle <- c("Ex07-US Dot Stkd Bars:Educational Progress (NAEP) in Math-2011, 8th Grade", "Centered at Not-Prof vs. Prof") grDevices::pdf(file=paste0(TDir,"Ex07-US Dot-Centered-Bar Educ.pdf"),width=7.5,height=10) micromapST(Educ8thData,panelDesc07, sortVar=3, title=ExTitle) x <- grDevices::dev.off() # ### End of example 07 ## Not run: ### # # Example 08 - use of state.x77 data table as data source # Data does not contain a row for DC, a missing sub-area. # Example also uses a smaller then 7.5 x 10 graphic space. # ### utils::data(state,envir=environment()) stateData <- as.data.frame(state.x77) rownames(stateData) <- state.abb panelDesc08 <- data.frame(type = c("maptail", "id", "dot"), lab1 = c("", "", "Murder"), lab3 = c("", "", "Murders per 100K Population"), col1 = c(NA, NA, 5)) ExTitle <- c("Ex08-US LM Plot of Murders in the United States", "No DC row entry.") grDevices::pdf(file = paste0(TDir,"Ex08_US state.x77_no_DC.pdf"), width = 5, height = 9) micromapST(stateData, panelDesc08, sortVar = 5, ascend = FALSE, title = ExTitle) x <- grDevices::dev.off() # ### End Example 08 ## End(Not run) ### # # Example 09 - US state map based on data from state.x77 table with # DC row added to complete data.frame, but with missing values (NAs). # The DC row will be sorted to the bottom of the list size # it does not contain any data. # # Used data and the panelDesc data.frames (stateData and panelDesc16) # used in example 09. # ### panelDesc09 <- data.frame(type = c("maptail", "id", "dot"), lab1 = c("", "", "Murder"), lab3 = c("", "", "Murders per 100K Population"), col1 = c(NA, NA, 5)) # add DC as 51st state with missing data "NA" to stateData. rm(state) utils::data(state,envir=environment()) stateData <- as.data.frame(state.x77) rownames(stateData) <- state.abb stateData <- rbind(stateData, DC = rep(NA, 8)) # missing values for DC row. ExTitle <- c("Ex09-US-LM Plot of Murders in the United States", "DC row added with NA, decending.") grDevices::pdf(file =paste0(TDir,"Ex09_US_state.x77_DCasNA_D.pdf"), width = 5, height = 9) micromapST(stateData, panelDesc09, sortVar = 5, ascend = FALSE, title = ExTitle) x <- grDevices::dev.off() # ### End Example 09 ### # # Example # 10 - Maps Seer Registries using the micromapST function # with the bordGrp = "USSeerBG". # ### # Load example data from package. utils::data(Seer18Area,envir=environment()) # set up 4 column page layout panelDesc10 = data.frame( type=c("mapcum","id","dotsignif","arrow") ,lab1=c("","","Rate Trend APC", "Rate Change") ,lab2=c("","","Dot-Signif","2002-06 to 2007-11") ,lab3=c("","","","") ,col1=c(NA,NA,"RateTrendAPC","Rate20022006") ,col2=c(NA,NA,"pValue", "Rate20072011") ) ExTitle <- c("Ex10-SeerStat Data-2002-6 and 2007-11", "Dot with Signif., Arrow and Bar") grDevices::pdf(file=paste0(TDir,"Ex10-SeerStat-DotSignif.pdf"),width=7.5,height=10) micromapST(Seer18Area,panelDesc10, sortVar="Rate20022006",ascend=FALSE, title=ExTitle, rowNames="alias",rowNamesCol='Registry', bordGrp="USSeerBG", plotNames="ab") x <- grDevices::dev.off() # # Both calls are effectively identical. # ### End of example 10 ### # # Example # 11 - Counties in Kansas on an 11 x 17 page # ### # Load example data from package. utils::data(KansPopInc,envir=environment()) # Four Column Layout: Map, ID, Dot, and Dot panelDesc11 = data.frame( type=c("map","id","dot", "dot") ,lab1=c("", "", "Population", "Average Inc.") ,lab2=c("", "", "in 2000", "per year") ,lab3=c("", "", "People", "") ,col1=c(NA, NA, "Pop", "AvgInc") ) ExTitle <- c("Ex11-Kansas Pop data 11x17", "Current Pop and Average Inc - scaling=e") grDevices::pdf(file=paste0(TDir,"Ex11-Kansas_Population_and_Income-11x17.pdf"), width=10, height=16) # tabloid size page (11x17) to handle 105 counties. # Use default scaling = "e" and no staggered labels, # Use full county names for data to boundary matching, # but presented abbreviated county names # in "id" glyphic column with large page. micromapST(KansPopInc, panelDesc11, sortVar=c("AvgInc","Pop"), ascend=FALSE, title=ExTitle, rowNames="full", rowNamesCol='County', bordGrp="KansasBG", plotNames="ab") x <- grDevices::dev.off() # ### End Example 11 ### # # micromapST - Example 12 - A linked micromap of the counties of # the state of Maryland using the border group "MarylandBG". # The MarylandPopInc data is shown using two dot glypics - current # population and average increase per county. # A "maptail" state map is used to show the counties in relationship # to the median county as sorted by the 1970 population. ### utils::data(mdPopData,envir=environment()) # set up 5 column page layout panelDesc12 = data.frame( type=c("maptail","id","dot","dot","arrow") ,lab1=c("","","Population", "Population","Change") ,lab2=c("","","in 1970","in 2000", "from 1970 to 2000") ,lab3=c("","","","","") ,col1=c(NA,NA,"X1970","X2010","X1970") ,col2=c(NA,NA,"","","X2010") ) ExTitle <- c("Ex12-Maryland Population-map", "1970 and 2010 Pop and Change,stag,sn") grDevices::pdf(file=paste0(TDir,"Ex12-MD Pop 1970 and 2010 plus change-map.pdf"), width=7.5, height=10.5) micromapST(mdPopData, panelDesc12, sortVar=2, ascend=FALSE, title=ExTitle, rowNames="full", rowNamesCol='County', bordGrp="MarylandBG", axisScale="sn", staggerLab=TRUE, plotNames="ab") x <- grDevices::dev.off() # ### End Example 12 ### # # micromapST - Example 13 - A linked micromap of the counties # of the state of New York state using the border group # "NewYorkDF". The pop/inc data is shown using two dot glyphs, # an arrow and bar glyph (2010 Population, an arrow showing the # change in population from 2000 to 2010, Population in 2000, # and a bar showing the amount of the change.) # ### # Load example data from package. utils::data(nyPopData,envir=environment()) nyPopData$Dif00_10 <- nyPopData$Pop_2010 - nyPopData$Pop_2000 # set up 6 column page layout with colSize panelDesc13 <- data.frame( type=c("map","id","dot", "arrow", "dot", "bar") ,lab1=c("", "", "Population in", "Increase from","Pop 2005","Incre") ,lab2=c("", "", "2010", "2000", "", "2000to2010") ,lab3=c("", "", "", "", "", "") ,col1=c(NA, NA, "Pop_2010", "Pop_2000", "Pop_2000","Dif00_10") ,col2=c(NA, NA, "", "Pop_2010", "", "") ,colSize=c(NA,NA, 15, 20, 5, 20) ) ExTitle <- c("Ex13-New York Population data", "2010 Pop and since 2000-colSize,sn,stag") grDevices::pdf(file=paste0(TDir,"Ex13-New York Pop 2010 and Change-sn colSize.pdf"), width=7.5, height=10.5) micromapST(nyPopData, panelDesc13, sortVar="Pop_2000", ascend=FALSE, title=ExTitle, rowNames="full",rowNamesCol="Area", axisScale="sn", staggerLab=TRUE, bordGrp="NewYorkBG" ) x <- grDevices::dev.off() # #### End Example 13 ### # # micromapST - Example 14 - A linked micromap of the counties in the # state of Utah. The UtahPopData data is shown using two dot glypics # - current population and average increase per area. # ### # Load example data from package. utils::data(UtahPopData,envir=environment()) # # Get population differences from 2011 to 2001 and 1991. # Data contains ",". The comma's must be removed and values are # converted to numbers. # If data is factors, need to add "as.character()" function # to the formula below. UtahPopData2 <- as.data.frame(sapply(UtahPopData, function(x) gsub(",","",x)),stringsAsFactors=FALSE) # Calculate the differenct between 2001 and 2011 population. UtahPopData2$Del1101 <- as.numeric(UtahPopData2$X2011) - as.numeric(UtahPopData2$X2001) # Calculate the difference between 1991 and 2001 population. UtahPopData2$Del0191 <- as.numeric(UtahPopData2$X2001) - as.numeric(UtahPopData2$X1991) # set up 5 column page layout panelDesc14 = data.frame( type=c("map","id","dot","arrow","arrow") ,lab1=c("","","Population", "2001-2011","Chg 1991-2001") ,lab2=c("","","in 2011","pop change","pop change") ,lab3=c("","","","","") ,col1=c(NA,NA,"X2011","X2011","X2001") ,col2=c(NA,NA,NA,"X2001","X1991") ) ExTitle <- c("Ex14 - Utah county population 2011", " and changes last two decades,sn") grDevices::pdf(file=paste0(TDir,"Ex14-Utah Population.pdf"), width=7.5,height=10.5) micromapST(UtahPopData, panelDesc14, sortVar="X2011", ascend=FALSE, title=ExTitle, rowNames="full",rowNamesCol='County', axisScale="sn", bordGrp="UtahBG", plotNames="ab" ) x <- grDevices::dev.off() # ### End Example 14 ### # # micromapST - Example 15 - A linked micromap of the provinces, # municipalities, autonomous regions and special administrative # regions of China using the border group of "ChinaDF". # The ChinaPopInc data is shown using two dot glypics - current # population and average increase per area. # ### utils::data(cnPopData,envir=environment()) # set up 4 column page layout panelDesc15 = data.frame( type=c("map","id","dot","bar") ,lab1=c("","","Population", "Population") ,lab2=c("","","in 2013","in 2013") ,lab3=c("","","","") ,col1=c(NA,NA,"pop2013","pop2013") ) ExTitle <- c("Ex14-China Population", "in 2013 by area") grDevices::pdf(file=paste0(TDir,"Ex15-China 2013 Population.pdf"), width=7.5, height=10.5) micromapST(cnPopData, panelDesc15, sortVar="pop2013", ascend=FALSE, title=ExTitle, rowNames="full", rowNamesCol='area', bordGrp="ChinaBG", plotNames="full") x <- grDevices::dev.off() # ### End Example 15 ### # # micromapST - Example 16 - A linked micromap of the districts # of the city Seoul South Korea, using the border group of # "SeoulSKoreaBG". The included SeoulPopData dataset provides # population and district area statistics for 2012. # The micromapST generates two glyphics, a sorted dot # glyphic based on the population and a bar graph based on # the area. # ### # Load example data from package. utils::data(SeoulPopData,envir=environment()) # set up 4 column page layout panelDesc16 = data.frame( type=c("map","id","dot","bar") ,lab1=c("","","Population", "Area") ,lab2=c("","","in 2012","in 2012") ,lab3=c("","","","sqkm") ,col1=c(NA,NA,"Pop.2012","Area") ) ExTitle <- c("Ex16-Seoul Population", "in 2012 by district") grDevices::pdf(file=paste0(TDir,"Ex16-Seoul 2012 Population.pdf"), width=7.5, height=10.5) micromapST(SeoulPopData,panelDesc16, sortVar=3, ascend=FALSE, # sort based on the population title=ExTitle, rowNames="full", rowNamesCol='District', bordGrp="SeoulSKoreaBG", plotNames="full" ) x <- grDevices::dev.off() # ### End Example 16 ### # # Example 17 - use of Africa population data as data source # Demonstrates support for vertical oriented geographical # areas. # ### # Load example data from package. utils::data(AfricaPopData,envir=environment()) panelDesc17 <- data.frame( type = c("map", "id", "dot", "dot", "dot"), lab1 = c("", "","Population","Percentage Of","Est x2 Time"), lab3 = c("", "","People", "Total", "Years"), col1 = c(NA, NA,"Projection","PercOf", "Est2Time") ) ExTitle <- c("Ex17-Africa Population Data", "Sorted by Population on 11x17") grDevices::pdf(file = paste0(TDir,"Ex17-Africa Micromap-11x17.pdf"), width = 11, height = 17) micromapST(AfricaPopData, panelDesc17, sortVar = "Projection", ascend = TRUE, title = ExTitle, rowNames = "ab", rowNamesCol = "Abbr", bordGrp = "AfricaBG" ) x <- grDevices::dev.off() # ### End Example 17 ### unlink(TDir)
### # # micromapST - Example # 01 - map with no cumulative shading, # 2 columns of statistics: dot with 95% confidence interval, # boxplot sorted in descending order by state rates, using # the default border group of "USStatesBG", with default symbols. ### # load sample data, compute boxplot TDir<-"c:/projects/statnet/" # my private test PDF directory exist, don't use temp. if (!dir.exists(TDir)) {TDir <- paste0(tempdir(),"/") } # get a temp directory for the output # PDF files for the example. cat("TempDir:",TDir,"\n") # replace this directory name with the location if you want to same # the output from the examples. utils::data(wflung00and95,wflung00and95US,wflung00cnty,envir=environment()) wfboxlist = graphics::boxplot(split(wflung00cnty$rate,wflung00cnty$stabr), plot=FALSE) # set up 4 column page layout panelDesc01 <- data.frame( type=c("map","id","dotconf","boxplot"), lab1=c("","","State Rate","County Rates"), lab2=c("","","and 95% CI","(suppressed if 1-9 deaths)"), lab3=c("","","Deaths per 100,000","Deaths per 100,000"), col1=c(NA,NA,1,NA),col2=c(NA,NA,3,NA),col3=c(NA,NA,4,NA), refVals=c(NA,NA,NA,wflung00and95US[1,1]), refTexts=c(NA,NA,NA,"US Rate 2000-4"), panelData= c("","","","wfboxlist") ) panelDesc <- panelDesc01 # set up PDF output file, call package ExTitle <- c("Ex01-US White Female Lung Cancer Mortality, 2000-2004", "State Rates & County Boxplots") grDevices::pdf(file=paste0(TDir,"Ex01-US-WFLung-2000-2004-St-DotCf-Co-Box.pdf"), width=7.5,height=10) micromapST(wflung00and95, panelDesc01, sortVar=1, ascend=FALSE, title=ExTitle ) x <- grDevices::dev.off() # ### End Example 01 ### # # micromapST - Example # 02 - map with cumulative shading # from top down (mapcum), arrow and bar charts, # sorted in descending order by starting # value of arrows (1950-69 rates) using default # border group of "USStatesDF". This # example also provides custom colors for the # linked micromaps, highlights, etc. # ### # Load example data from package. utils::data(wmlung5070,wmlung5070US,envir=environment()) panelDesc02 <- data.frame( type=c("mapcum","id","arrow","bar"), lab1=c("","","Rates in","Percent Change"), lab2=c("","","1950-69 and 1970-94","1950-69 To 1970-94"), lab3=c("MAPCUM","","Deaths per 100,000","Percent"), col1=c(NA,NA,"RATEWM_50","PERCENT"), col2=c(NA,NA,"RATEWM_70",NA) ) colorsRgb = matrix(c( # the basic 7 colors. 213, 62, 79, #region 1: red #D53E4F - Rust Red 252, 141, 89, #region 2: orange #FC8D59 - Brn/Org 253, 225, 139, #region 3: green #FEE08B - Pale Brn 153, 213, 148, #region 4: greenish blue #99D594 - med Green 50, 136, 189, #region 5: lavendar #3288BD - Blue 255, 0, 255, #region 6 #FF00FF - Magenta .00, .00, .00, #region 7: black for median #000000 - Black 230, 245, 152, #non-highlighted foreground #E6F598 - YellowGreen 255, 174, 185, # alternate shape upper #FFAEB9 - Mauve 191, 239, 255, # alternate shape lower #BFEFFF - Cyan 242, 242, 242, # lightest grey for non-referenced sub-areas #F2F2F2 234, 234, 234), # lighter grey for bkg - non-active sub-areas. #EAEAEA ncol=3,byrow=TRUE) xcolors = c( grDevices::rgb(colorsRgb[,1],colorsRgb[,2],colorsRgb[,3], maxColorValue=255), # set solid colors grDevices::rgb(colorsRgb[,1],colorsRgb[,2],colorsRgb[,3],64, maxColorValue=255)) # set translucent colors for time series. # set up reference names for color set names(xcolors) =c("rustred","orange","lightbrown","mediumgreen", "blue","magenta", "black","yellowgreen", "mauve","cyan","lightest grey","lighter grey", "l_rustred","l_orange","vlightbrown","lightgreen", "lightblue","l_black","l_yelgreen","l_mauve", "l_cyan","l_lightest grey","l_lighter grey") ExTitle <- c("Ex02-US Change in White Male Lung Cancer Mortality Rates", "from 1950-69 to 1970-94-Diff colors") grDevices::pdf(file=paste0(TDir,"Ex02-US WmLung50-70-Arrow-Bar.pdf"),width=7.5,height=10) micromapST(wmlung5070,panelDesc02,sortVar=1,ascend=FALSE, title=ExTitle, colors=xcolors ) x <- grDevices::dev.off() # ### End Example 02 ## Not run: ### # # micromapST - Example # 03 - Time Series Line Plots with # Confidence Bands maptail option highlights states from extremes # to middle state read in time series data set example using the # default border group of "USStatesDF". # ### # Load example data from package. utils::data(TSdata,envir=environment()) temprates <- data.frame(TSdata[,,2]) # TSdata structure is array of size c(51,15,4), # dimensions = 51 states, 15 years, (year label, point value, low limit, # high limit) panelDesc03 <- data.frame( type=c("maptail","id","tsconf","dot"), lab1=c("","","Time Series","Female"), lab2=c("","","Annual Rate per 100,000","Most Recent Rate (2010)"), lab3=c("","","Years","Deaths per 100,000"), lab4=c("","","Rate",""), col1=c(NA,NA,NA,15), panelData =c(NA,NA,"TSdata",NA) ) ExTitle <- c("Ex03-US Time Series with Confidence bands", "Annual Female Lung Cancer Mortality Rates, 1996-2010") grDevices::pdf(file=paste0(TDir,"Ex03-US Time-Series-with-Conf.pdf"), width=7.5,height=10) micromapST(temprates,panelDesc03,sortVar="P15",ascend=FALSE, title=ExTitle) x <- grDevices::dev.off() # ### End Example 03 ## End(Not run) ### # # micromapST - Example # 03a - Time Series Line Plots with # Confidence Bands maptail option highlights states from extremes # to middle state read in time series data set example using the # default border group of "USStatesDF". # # Specify the x-Axis values are dates and to format them as dates. ### # Load example data from package. utils::data(TSdata,envir=environment()) temprates <- data.frame(TSdata[,,2]) # y rate # In the original package TS data, the x data was not # a date value, it was the year number. To be able to demonstrate # the X-Axis Date format labeling, these were changed to Date values # by effectively substracting 1970-1-1 from the year value. #### # # Example 3a - Building TS conf array and converting years # into date values for the X-Axis and labels. # # example of build a TS Conf array. # # Using the old TSdata array as a starting point and source of data, # but build an entirely new TS array structure in a similar manner # that might be used to build your own time series array. # data(TSdata) # get old array TSAreas <- row.names(TSdata) # one per area (index 1) NewArray <- array(dim=c(51,15,4),dimnames=list(TSAreas)) # this is for 51 states, 15 samples/observations, and 4 values per sample. for (inx in seq(1,length(TSAreas))) { # loop once per area Samp <- TSdata[inx,,] # samples for an area # each sample has 15 observations of 4 values. # value 1 is the X axis data or the DATE of the observation Samp[,1] <- as.Date(paste0(as.character(Samp[,1]),"-01-01")) # convert simple year number to date NewArray[inx,,] <- Samp } # setting the attribute "xIsDate" on array to TRUE, signals micromapST # the user wants to see the x-axis values as dates. ## Not run: attr(NewArray,"xIsDate") <- TRUE # TSdata and NewArray structures are arrays of size c(51,15,4), # dimensions = 51 states, 15 years, (year label, point value, low limit, high limit) panelDesc03a <- data.frame( type=c("maptail","id","tsconf","dot"), lab1=c("","","Time Series (YYYY-MM)","Female"), # recommend adding to the column title a note about the date format used. lab2=c("","","Annual Rate per 100,000","Most Recent Rate (2010)"), lab3=c("","","Years","Deaths per 100,000"), lab4=c("","","Rate",""), col1=c(NA,NA,NA,15), panelData =c(NA,NA,"NewArray",NA) ) ExTitle <- c("Ex03a-US Time Series with Confidence bands with time (yyyy-mm)", "Annual Female Lung Cancer Mortality Rates, 1996-2010") grDevices::pdf(file=paste0(TDir,"Ex03a-US Time-Series-with-Conf wDates.pdf"), width=7.5,height=10) micromapST(temprates,panelDesc03a,sortVar="P15",ascend=FALSE, axisScale="s", title=ExTitle) x <- grDevices::dev.off() ## End(Not run) # ### End Example 03a ### # # micromapST - Example 04 - dot followed by a scatter dot columns # use same data as Example 3 to compare 1996 & 2010 rates # mapmedian option shades states above or below the median # (light yellow) using the default border group of "USStatesBG" # # USES data loaded for Example 03 (temprates). # #### # Load example data from package. utils::data(TSdata,envir=environment()) temprates <- data.frame(TSdata[,,2]) # y rate panelDesc04 <- data.frame( type=c("mapmedian","id","dot","scatdot"), lab1=c("","","Female Lung Cancer Mortality","Comparison of Rates"), lab2=c("","","Rate in 1996 (Sort Variable)", "in 1996 (x axis) and 2010 (y axis)"), lab3=c("","","Deaths per 100,000","Deaths per 100,000 in 1996"), lab4=c("","","","Rate in 2010"), col1=c(NA,NA,1,1), col2=c(NA,NA,NA,15) ) ExTitle <- c("Ex04-US Dot Plot for 1996, Scatter Plot Comparing 1996 to 2010", "Female Lung Cancer Mortality Rates") FName <- paste0(TDir,"Ex04-US FLCMR Scatter-Dots-1996-2010.pdf") grDevices::pdf(file=FName,width=7.5,height=10) micromapST(temprates,panelDesc04,sortVar=1,ascend=FALSE,title=ExTitle) x <- grDevices::dev.off() # ### End Example 04 ### # # micromapST - Example 05 - horizontal stacked (segmented) bars # segbar plots the input data, normbar plots percent of total # package computes the percents from input data # input for the categories for each state must be in consecutive # columns of the input data.frame using the default border group # of "USStatesBG" #### # Load example data from package. utils::data(statePop2010,envir=environment()) panelDesc05 <- data.frame( type=c("map","id","segbar","normbar"), lab1=c("","","Stacked Bar","Normalized Stacked Bar"), lab2=c("","","Counts","Percent"), col1=c(NA,NA,"Hisp","Hisp"), col2=c(NA,NA,"OtherWBH","OtherWBH") ) ExTitle <- c("Ex05-Stkd Norm Bars: 2010 Census Pop by Race, Sorted by Cnt Other Race", "Cat-L to R: Hispanic, non-Hisp White, Black, Other-sn-varbar") grDevices::pdf(file=paste0(TDir,"Ex05-US Stkd-Norm Bar-var-height.pdf"), width=7.5,height=10) micromapST(statePop2010, panelDesc05, sortVar="OtherWBH", ascend=FALSE, title= ExTitle, details=list(SNBar.varht=TRUE), axisScale="sn" ) x <- grDevices::dev.off() # ### End Example 05 ## Not run: ### # # micromapST - Example 06 - horizontal stacked (segmented) bars # segbar plots the input data, normbar plots percent of total # package computes the percents from input data # input for the categories for each state must be in consecutive # columns of the input data.frame using the default border group # of "USStatesBG". # # Turning off the variable bar height and the midpoint dot features # in the horizontal stacked bars (segmented) # # USES data loaded for Example 05 above - statePop2010. # ### # Reuse data loaded for Example 5 above. panelDesc06= data.frame( type=c("map","id","segbar","normbar"), lab1=c("","","Stacked Bar","Normalized Stacked Bar"), lab2=c("","","Counts","Percent"), col1=c(NA,NA,"Hisp","Hisp"), col2=c(NA,NA,"OtherWBH","OtherWBH") ) ExTitle <- c("Ex06-Stacked Norm Bars: 2010 Census Pop by Race, Sorted by Other Race", "Cat-L to R: Hisp, non-Hisp White, Black, Other,ID-diamond") grDevices::pdf(file=paste0(TDir,"Ex06-Stkd-Norm-Bar-fixedheight-nodot.pdf"), width=7.5,height=10) micromapST(statePop2010,panelDesc06,sortVar=4,ascend=FALSE, title= ExTitle, details=list(SNBar.Middle.Dot=FALSE,SNBar.varht=FALSE,Id.Dot.pch=23) ) x <- grDevices::dev.off() # ### End Example 06 ## End(Not run) ### # # micromapST - Example 07 - centered (diverging) stacked bars # # National 8th grade Math Proficiency NAEP Test Scores Data for 2011 # source: National Center for Education Statistics, # http://nces.ed.gov/nationsreportcard/naepdata/ # bar segment values - % in each of 4 categories: # % < Basic, % at Basic, % Proficient, % Advanced # using the default border group of "USStatesBG". #### # Load example data from package. utils::data(Educ8thData,envir=environment()) # columns = State abbrev, State name, Avg Score, %s \<basic, # basic, proficient, advanced panelDesc07 <- data.frame( type=c("map","id","dot","ctrbar"), lab1=c("","","Avg. Scores","Math Proficiency"), lab2=c("","","","<Basic, Basic, Proficient, Advanced"), lab3=c("","","","% to Left of 0 | % to Right"), col1=c(NA,NA,"avgscore","PctBelowBasic"), col2=c(NA,NA,NA,"PctAdvanced") ) ExTitle <- c("Ex07-US Dot Stkd Bars:Educational Progress (NAEP) in Math-2011, 8th Grade", "Centered at Not-Prof vs. Prof") grDevices::pdf(file=paste0(TDir,"Ex07-US Dot-Centered-Bar Educ.pdf"),width=7.5,height=10) micromapST(Educ8thData,panelDesc07, sortVar=3, title=ExTitle) x <- grDevices::dev.off() # ### End of example 07 ## Not run: ### # # Example 08 - use of state.x77 data table as data source # Data does not contain a row for DC, a missing sub-area. # Example also uses a smaller then 7.5 x 10 graphic space. # ### utils::data(state,envir=environment()) stateData <- as.data.frame(state.x77) rownames(stateData) <- state.abb panelDesc08 <- data.frame(type = c("maptail", "id", "dot"), lab1 = c("", "", "Murder"), lab3 = c("", "", "Murders per 100K Population"), col1 = c(NA, NA, 5)) ExTitle <- c("Ex08-US LM Plot of Murders in the United States", "No DC row entry.") grDevices::pdf(file = paste0(TDir,"Ex08_US state.x77_no_DC.pdf"), width = 5, height = 9) micromapST(stateData, panelDesc08, sortVar = 5, ascend = FALSE, title = ExTitle) x <- grDevices::dev.off() # ### End Example 08 ## End(Not run) ### # # Example 09 - US state map based on data from state.x77 table with # DC row added to complete data.frame, but with missing values (NAs). # The DC row will be sorted to the bottom of the list size # it does not contain any data. # # Used data and the panelDesc data.frames (stateData and panelDesc16) # used in example 09. # ### panelDesc09 <- data.frame(type = c("maptail", "id", "dot"), lab1 = c("", "", "Murder"), lab3 = c("", "", "Murders per 100K Population"), col1 = c(NA, NA, 5)) # add DC as 51st state with missing data "NA" to stateData. rm(state) utils::data(state,envir=environment()) stateData <- as.data.frame(state.x77) rownames(stateData) <- state.abb stateData <- rbind(stateData, DC = rep(NA, 8)) # missing values for DC row. ExTitle <- c("Ex09-US-LM Plot of Murders in the United States", "DC row added with NA, decending.") grDevices::pdf(file =paste0(TDir,"Ex09_US_state.x77_DCasNA_D.pdf"), width = 5, height = 9) micromapST(stateData, panelDesc09, sortVar = 5, ascend = FALSE, title = ExTitle) x <- grDevices::dev.off() # ### End Example 09 ### # # Example # 10 - Maps Seer Registries using the micromapST function # with the bordGrp = "USSeerBG". # ### # Load example data from package. utils::data(Seer18Area,envir=environment()) # set up 4 column page layout panelDesc10 = data.frame( type=c("mapcum","id","dotsignif","arrow") ,lab1=c("","","Rate Trend APC", "Rate Change") ,lab2=c("","","Dot-Signif","2002-06 to 2007-11") ,lab3=c("","","","") ,col1=c(NA,NA,"RateTrendAPC","Rate20022006") ,col2=c(NA,NA,"pValue", "Rate20072011") ) ExTitle <- c("Ex10-SeerStat Data-2002-6 and 2007-11", "Dot with Signif., Arrow and Bar") grDevices::pdf(file=paste0(TDir,"Ex10-SeerStat-DotSignif.pdf"),width=7.5,height=10) micromapST(Seer18Area,panelDesc10, sortVar="Rate20022006",ascend=FALSE, title=ExTitle, rowNames="alias",rowNamesCol='Registry', bordGrp="USSeerBG", plotNames="ab") x <- grDevices::dev.off() # # Both calls are effectively identical. # ### End of example 10 ### # # Example # 11 - Counties in Kansas on an 11 x 17 page # ### # Load example data from package. utils::data(KansPopInc,envir=environment()) # Four Column Layout: Map, ID, Dot, and Dot panelDesc11 = data.frame( type=c("map","id","dot", "dot") ,lab1=c("", "", "Population", "Average Inc.") ,lab2=c("", "", "in 2000", "per year") ,lab3=c("", "", "People", "") ,col1=c(NA, NA, "Pop", "AvgInc") ) ExTitle <- c("Ex11-Kansas Pop data 11x17", "Current Pop and Average Inc - scaling=e") grDevices::pdf(file=paste0(TDir,"Ex11-Kansas_Population_and_Income-11x17.pdf"), width=10, height=16) # tabloid size page (11x17) to handle 105 counties. # Use default scaling = "e" and no staggered labels, # Use full county names for data to boundary matching, # but presented abbreviated county names # in "id" glyphic column with large page. micromapST(KansPopInc, panelDesc11, sortVar=c("AvgInc","Pop"), ascend=FALSE, title=ExTitle, rowNames="full", rowNamesCol='County', bordGrp="KansasBG", plotNames="ab") x <- grDevices::dev.off() # ### End Example 11 ### # # micromapST - Example 12 - A linked micromap of the counties of # the state of Maryland using the border group "MarylandBG". # The MarylandPopInc data is shown using two dot glypics - current # population and average increase per county. # A "maptail" state map is used to show the counties in relationship # to the median county as sorted by the 1970 population. ### utils::data(mdPopData,envir=environment()) # set up 5 column page layout panelDesc12 = data.frame( type=c("maptail","id","dot","dot","arrow") ,lab1=c("","","Population", "Population","Change") ,lab2=c("","","in 1970","in 2000", "from 1970 to 2000") ,lab3=c("","","","","") ,col1=c(NA,NA,"X1970","X2010","X1970") ,col2=c(NA,NA,"","","X2010") ) ExTitle <- c("Ex12-Maryland Population-map", "1970 and 2010 Pop and Change,stag,sn") grDevices::pdf(file=paste0(TDir,"Ex12-MD Pop 1970 and 2010 plus change-map.pdf"), width=7.5, height=10.5) micromapST(mdPopData, panelDesc12, sortVar=2, ascend=FALSE, title=ExTitle, rowNames="full", rowNamesCol='County', bordGrp="MarylandBG", axisScale="sn", staggerLab=TRUE, plotNames="ab") x <- grDevices::dev.off() # ### End Example 12 ### # # micromapST - Example 13 - A linked micromap of the counties # of the state of New York state using the border group # "NewYorkDF". The pop/inc data is shown using two dot glyphs, # an arrow and bar glyph (2010 Population, an arrow showing the # change in population from 2000 to 2010, Population in 2000, # and a bar showing the amount of the change.) # ### # Load example data from package. utils::data(nyPopData,envir=environment()) nyPopData$Dif00_10 <- nyPopData$Pop_2010 - nyPopData$Pop_2000 # set up 6 column page layout with colSize panelDesc13 <- data.frame( type=c("map","id","dot", "arrow", "dot", "bar") ,lab1=c("", "", "Population in", "Increase from","Pop 2005","Incre") ,lab2=c("", "", "2010", "2000", "", "2000to2010") ,lab3=c("", "", "", "", "", "") ,col1=c(NA, NA, "Pop_2010", "Pop_2000", "Pop_2000","Dif00_10") ,col2=c(NA, NA, "", "Pop_2010", "", "") ,colSize=c(NA,NA, 15, 20, 5, 20) ) ExTitle <- c("Ex13-New York Population data", "2010 Pop and since 2000-colSize,sn,stag") grDevices::pdf(file=paste0(TDir,"Ex13-New York Pop 2010 and Change-sn colSize.pdf"), width=7.5, height=10.5) micromapST(nyPopData, panelDesc13, sortVar="Pop_2000", ascend=FALSE, title=ExTitle, rowNames="full",rowNamesCol="Area", axisScale="sn", staggerLab=TRUE, bordGrp="NewYorkBG" ) x <- grDevices::dev.off() # #### End Example 13 ### # # micromapST - Example 14 - A linked micromap of the counties in the # state of Utah. The UtahPopData data is shown using two dot glypics # - current population and average increase per area. # ### # Load example data from package. utils::data(UtahPopData,envir=environment()) # # Get population differences from 2011 to 2001 and 1991. # Data contains ",". The comma's must be removed and values are # converted to numbers. # If data is factors, need to add "as.character()" function # to the formula below. UtahPopData2 <- as.data.frame(sapply(UtahPopData, function(x) gsub(",","",x)),stringsAsFactors=FALSE) # Calculate the differenct between 2001 and 2011 population. UtahPopData2$Del1101 <- as.numeric(UtahPopData2$X2011) - as.numeric(UtahPopData2$X2001) # Calculate the difference between 1991 and 2001 population. UtahPopData2$Del0191 <- as.numeric(UtahPopData2$X2001) - as.numeric(UtahPopData2$X1991) # set up 5 column page layout panelDesc14 = data.frame( type=c("map","id","dot","arrow","arrow") ,lab1=c("","","Population", "2001-2011","Chg 1991-2001") ,lab2=c("","","in 2011","pop change","pop change") ,lab3=c("","","","","") ,col1=c(NA,NA,"X2011","X2011","X2001") ,col2=c(NA,NA,NA,"X2001","X1991") ) ExTitle <- c("Ex14 - Utah county population 2011", " and changes last two decades,sn") grDevices::pdf(file=paste0(TDir,"Ex14-Utah Population.pdf"), width=7.5,height=10.5) micromapST(UtahPopData, panelDesc14, sortVar="X2011", ascend=FALSE, title=ExTitle, rowNames="full",rowNamesCol='County', axisScale="sn", bordGrp="UtahBG", plotNames="ab" ) x <- grDevices::dev.off() # ### End Example 14 ### # # micromapST - Example 15 - A linked micromap of the provinces, # municipalities, autonomous regions and special administrative # regions of China using the border group of "ChinaDF". # The ChinaPopInc data is shown using two dot glypics - current # population and average increase per area. # ### utils::data(cnPopData,envir=environment()) # set up 4 column page layout panelDesc15 = data.frame( type=c("map","id","dot","bar") ,lab1=c("","","Population", "Population") ,lab2=c("","","in 2013","in 2013") ,lab3=c("","","","") ,col1=c(NA,NA,"pop2013","pop2013") ) ExTitle <- c("Ex14-China Population", "in 2013 by area") grDevices::pdf(file=paste0(TDir,"Ex15-China 2013 Population.pdf"), width=7.5, height=10.5) micromapST(cnPopData, panelDesc15, sortVar="pop2013", ascend=FALSE, title=ExTitle, rowNames="full", rowNamesCol='area', bordGrp="ChinaBG", plotNames="full") x <- grDevices::dev.off() # ### End Example 15 ### # # micromapST - Example 16 - A linked micromap of the districts # of the city Seoul South Korea, using the border group of # "SeoulSKoreaBG". The included SeoulPopData dataset provides # population and district area statistics for 2012. # The micromapST generates two glyphics, a sorted dot # glyphic based on the population and a bar graph based on # the area. # ### # Load example data from package. utils::data(SeoulPopData,envir=environment()) # set up 4 column page layout panelDesc16 = data.frame( type=c("map","id","dot","bar") ,lab1=c("","","Population", "Area") ,lab2=c("","","in 2012","in 2012") ,lab3=c("","","","sqkm") ,col1=c(NA,NA,"Pop.2012","Area") ) ExTitle <- c("Ex16-Seoul Population", "in 2012 by district") grDevices::pdf(file=paste0(TDir,"Ex16-Seoul 2012 Population.pdf"), width=7.5, height=10.5) micromapST(SeoulPopData,panelDesc16, sortVar=3, ascend=FALSE, # sort based on the population title=ExTitle, rowNames="full", rowNamesCol='District', bordGrp="SeoulSKoreaBG", plotNames="full" ) x <- grDevices::dev.off() # ### End Example 16 ### # # Example 17 - use of Africa population data as data source # Demonstrates support for vertical oriented geographical # areas. # ### # Load example data from package. utils::data(AfricaPopData,envir=environment()) panelDesc17 <- data.frame( type = c("map", "id", "dot", "dot", "dot"), lab1 = c("", "","Population","Percentage Of","Est x2 Time"), lab3 = c("", "","People", "Total", "Years"), col1 = c(NA, NA,"Projection","PercOf", "Est2Time") ) ExTitle <- c("Ex17-Africa Population Data", "Sorted by Population on 11x17") grDevices::pdf(file = paste0(TDir,"Ex17-Africa Micromap-11x17.pdf"), width = 11, height = 17) micromapST(AfricaPopData, panelDesc17, sortVar = "Projection", ascend = TRUE, title = ExTitle, rowNames = "ab", rowNamesCol = "Abbr", bordGrp = "AfricaBG" ) x <- grDevices::dev.off() # ### End Example 17 ### unlink(TDir)
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call argument is used to specify the border group dataset for the geographical area. The NewYorkBG border group dataset supports creating linked micromaps for the 62 counties in the state of New York. When the bordGrp call argument is set to NewYorkBG, the appropriate name table (county names and abbreviations) and the 62 sub-areas (countries) boundary data is loaded in micromapST. The user's data is then linked to the boundary data via the county's name, abbreviated or ID based on the table below.
data(NewYorkBG)
data(NewYorkBG)
The NewYorkBG border group dataset contains the following data.frames:
- contains specific parameters for the border group
- containing the names, abbreviations, numerical identifier and alias matching string for each of the 62 counties in New York
- the boundary point lists for each area.
- the boundaries for an intermediate level. For this border group, this boundary data.frame is not used and set to L3VisBorders as a place holder.
- the boundaries for an intermediate level. For this border group, this boundary data.frame is not used and set to L3VisBorders as a place holder.
- the boundary of the state of New York.
The New York county border group contains 62 county sub-areas. Each county has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets. No regions are defined in the New York county border group, so the L2VisBorders dataset is not used and the regions option is disabled. The L3VisBorders dataset contains the outline of the state of New York.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area (county) using the fullname, abbreviation, and numerical identifier for each country to the <statsDFrame> data based on the setting of the rowNames call argument.
A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area (county) in the name table, the sub-area (county) is mapped but not colored.
The following are a list of the names, abbreviations, and IDs for each country in the NewYorkBG border group.
name | ab | id |
Albany | ALBA | 36001 |
Allegany | ALLE | 36003 |
Bronx | BRON | 36005 |
Broome | BROO | 36007 |
Cattaraugus | CATT | 36009 |
Cayuga | CAYU | 36011 |
Chautauqua | CHAU | 36013 |
Chemung | CHEM | 36015 |
Chenango | CHEN | 36017 |
Clinton | CLIN | 36019 |
Columbia | COLU | 36021 |
Cortland | CORT | 36023 |
Delaware | DELA | 36025 |
Dutchess | DUTC | 36027 |
Erie | ERIE | 36029 |
Essex | ESSE | 36031 |
Franklin | FRAN | 36033 |
Fulton | FULT | 36035 |
Genesee | GENE | 36037 |
Greene | GREE | 36039 |
Hamilton | HAMI | 36041 |
Herkimer | HERK | 36043 |
Jefferson | JEFF | 36045 |
Kings | KING | 36047 |
Lewis | LEWI | 36049 |
Livingston | LIVI | 36051 |
Madison | MADI | 36053 |
Monroe | MONR | 36055 |
Montgomery | MONT | 36057 |
Nassau | NASS | 36059 |
New York | NEWY | 36061 |
Niagara | NIAG | 36063 |
Oneida | ONEI | 36065 |
Onondaga | ONON | 36067 |
Ontario | ONTA | 36069 |
Orange | ORAN | 36071 |
Orleans | ORLE | 36073 |
Oswego | OSWE | 36075 |
Otsego | OTSE | 36077 |
Putnam | PUTN | 36079 |
Queens | QUEE | 36081 |
Rensselaer | RENS | 36083 |
Richmond | RICH | 36085 |
Rockland | ROCK | 36087 |
St. Lawrence | STLA | 36089 |
Saratoga | SARA | 36091 |
Schenectady | SCHE | 36093 |
Schoharie | SCHO | 36095 |
Schuyler | SCHU | 36097 |
Seneca | SENE | 36099 |
Steuben | STEU | 36101 |
Suffolk | SUFF | 36103 |
Sullivan | SULL | 36105 |
Tioga | TIOG | 36107 |
Tompkins | TOMP | 36109 |
Ulster | ULST | 36111 |
Warren | WARR | 36113 |
Washington | WASH | 36115 |
Wayne | WAYN | 36117 |
Westchester | WEST | 36119 |
Wyoming | WYOM | 36121 |
Yates | YATE | 36123 |
There are no alternate abbreviations or regions assocated with counties in New York border group.
The id field value is the U. S. state and county FIPS code.
The rowNames = "alias" or "alt_ab", the regionB and dataRegionsOnly features are not supported in the NewYorkBG border group.
This dataset contains the 2014 population and average income per person in each of the New York counties.
data(nyPopData)
data(nyPopData)
A data frame with 63 rows, 1 for each county and 14 variables per county:
a character vector containing the New York county and buorgh name.
a numeric vector of the county's estimated population for 2000.
a numeric vector of the county's estimated population for 2001.
a numeric vector of the county's estimated population for 2002.
a numeric vector of the county's estimated population for 2003.
a numeric vector of the county's estimated population for 2004.
a numeric vector of the county's estimated population for 2005.
a numeric vector of the county's estimated population for 2006.
a numeric vector of the county's estimated population for 2007.
a numeric vector of the county's estimated population for 2008.
a numeric vector of the county's estimated population for 2009.
a numeric vector of the county's estimated population for 2010.
a numeric vector of the county's actual population for 2000.
a numeric vector of the county's actual population for 2010.
This dataset was pulled from the New York government website in January, 2015. It contains the actual and estimate populations of each county from 2000 to 2010.
The panelDesc data.frame provides the micromapST function with
the information required to process the statsDFrame data and panelData data.frames
and to generate the required linked micromap plot.
It specifies which columns in the statsDFrame data.frame contain the data
for each glyph column, the column types, labels, reference values and text, and when
more complex data is needed by a glyph (boxplot and time series) what the name of the data
structure..
Example panelDesc = data.frame( type=c("mapcum","id","dotconf","dotconf"), lab1=c("","","White Males","White Females"), lab2=c("","","Rate and 95% CI","Rate and 95% CI"), lab3=c("","","Deaths per 100,000","Deaths per 100,000"), col1=c(NA,NA,"Rate",9), col2=c(NA,NA,4,11), col3=c(NA,NA,5,12), colSize=c(NA,NA,5,5), refVals=c(NA,NA,NA,wflungUS[,1]), refTexts=c(NA,NA,NA,"US Rate"), panelData=c("","","","")
The panelDesc data.frame (which does not have to be named "panelDesc", any name will do) provides the means of defining how many columns to create, the type of glyph per column, where the data required by the glyph is located in the statsDFrame (column number or name) or the name of a supplemental data structure when the glyph is boxplots or time series (via the panelData list entry), the column titles, and the column's reference value and label for the link micromap generation.
In the following description the term "AREA" represents the geographic unit being mapped and associated with data in the statsDFrame. The naming used must match the border group specified. If the border group of "USStatesDF" is used, the areas are U.S. States and DC and 51 data rows must be present. If the border group of "USSeerDF" is used, the areas are U.S. Seer areas as defined by NCI and the number of data rows can be 9, 11, 13, 17 or 18. In all cases, the abbreviations and names defined in the border group dataset must be used in preparing the statsDFrame and panelData structures.
Glyph Types
The type vector defines the type of glyph to be used for each column. The available glyphs are:
"map", "mapcum","maptail","mapmedian"
"id"
"rank"
"dot", "dotse","dotconf", "dotsignf", "bar", "arrow", "ts", "tsconf","scatdot", "segbar", "normbar", "ctrbar", "boxplot"
The following provides a description of each panel type:
- US map with active areas colored
- US map with active areas colored and previously active area highlighted generating an accumulation from top to bottom
- US map with active areas colored and previously active area highlighted until the median area, then the reverse to the end (areas that have not been active are highlighted.)
- US map with active areas colored. Maps above the median area have areas with values above the median highlighted. Maps below the median area have areas with values below the median highlighted. This helps define the above and below median area groups.
- generates a column with a colored identifier (a square) and the area or area name or abbreviation.
- number the area in rank order, sequentially.
- an arrow between two values with a head.
- a single bar chart.
- a boxplot per area with box, upper and lower whiskers and outliers.
- a dot for a single value.
- a dot for a single value and its standard error.
- a dot for a single value and its confidence interval.
- a dot for a single value with an indicator of its significants.
- a time series line for up to 30 sets "x" and "y" values for each area. The TS glyph can have X-Axis labels formated as numbers or dates.
- a time series line for a up to 30 sets of "x", "y" and upper "y" and lower "y" values as a confidence interval band for each area. The TSConf glyph can have X-Axis labels formated as numbers or dates.
- a horizontal stacked (segmented) bar plot starting at 0 for 2 to 9 bars.
- a stacked bar plot where the data is normalized for each area by dividing the bar segment values by the sum of the values for all of the bars. Up to 9 bars are supported.
- a stacked bar plot where the bar segments are centered around the 0.Up to 9 bars are supported.
- a set of points for each area with an "x" and "y" value.
Labels (Column Headers and Footers)
micromapST supports up to 3 column labels or titles: lab1, lab2 and lab3, where lab1 and lab2 are header titles for the column. lab3 is the footer title for the column. All titles are optional. lab3 is used to indicate the unit of measure at the bottom of the columns, but is not limited to this use. For example:
lab1=c("Col1-Title", "Col2=Title", "Col3-Title" ) # 1st title for columns lab2=c("Col1-Sub", "Col2-Sub", "Col3-Sub" ) # 2nd title for columns lab3=c("Col1-Footer","Col2-Footer","Col3-Footer") # Footer title for columns
lab4 is used only when time series or scatter dot glyphs are used to provide a Y axis title for the column. All label-title vectors are optional and only required when an title or label is needed.
Data References
Depending on the type of glyph selected for the column, 1 to 3 data values for each area may be required: The col1, col2 and col3 vectors serve as indexes to columns in the statsDFrame data.frame passed in the arguments of the micromapST function call. The values can be either the numeric number of the row in statsDFrame data.frame or the column name. If no index is required, the entry should be set to NA.
If the glyph requires one value, then only the col1 index is used and the col2 and col3 indexes are set to NA if present . If 2 values are required, then col1 and col2 indexes are used and the col3 index is set to NA, if present. If 3 values are required, then col1, col2, and col3 indexes are used.
The statsDFrame column indexes can be provided as an integer or the column name. If the integer value is less than 1 or greater than the number of columns in statsDFrame or a column name is used that does not exist in statsDFrame, the micromapST function will stop and generate an error message.
Glyph | Meaning | col1 | col2 | col3 | panelData |
Name | |||||
arrow | Arrow | Beginning | Ending Values | NA | NA |
Values | (arrow head) | ||||
bar | Horizontal | Bar end | NA | NA | NA |
bar | values | ||||
(length) | |||||
segbar | Horizontal | Values for | Values for | NA | NA |
stacked | first (left | the last | |||
bar | -most) segment | (right-most) | |||
(length) | bar segment | ||||
(length) | |||||
normbar | Horizontal | Values for | Values for | NA | NA |
stacked | first (left- | last (right- | |||
bar, nor- | most) bar | most,bar | |||
malized to | segment | segment | |||
total 100% | (length) | (length) | |||
ctrbar | Horizontal | Values for | Values for | NA | NA |
stacked | first (left- | last (right- | |||
bar, cen- | most) bar | most,bar | |||
tered on | segment | segment | |||
the middle | (length) | (length) | |||
bar | |||||
boxplot | Horizontal | NA | NA | NA | Name of |
box plot | output | ||||
list from | |||||
call to | |||||
boxplot(...plot=F) | |||||
dot | Dot | Values for | NA | NA | NA |
dots | |||||
dotconf | Dot with | Values | Values of | Values for | NA |
confidence | for dots | lower limits | upper limits tab | ||
interval | |||||
line | |||||
dotse | Dot with | Values for | Standard | NA | NA |
line length | dots | errors | |||
+/- standard | |||||
error | |||||
dotsignf | Dot | Values for | P value | NA | NA |
overprinted | dots | associated | |||
if not | with dot | ||||
significant | |||||
scatdot | Scater plot | Values on | Values on | NA | NA |
of dots | horizontal | vertical | |||
(x) axis | (y) axis | ||||
ts | Time Series | NA | NA | NA | Name of array |
(line) plot | with dimensions | ||||
of c(51,t,2), | |||||
where t = # | |||||
of time points | |||||
(max 15), x values | |||||
in [,,1], y values | |||||
in [,,2] | |||||
tsconf | Time Series | NA | NA | NA | Name of array |
(line) plot | with dimensions | ||||
with confidence | of c(51,t,4), as ts | ||||
limits | lower limit is | ||||
[,,3] amd the | |||||
upper limit is | |||||
[,,4] | |||||
The panelData data.frame is only used when a glyph requires more data per area than can be provided by the statsDFrame columns. Only glyphs using this vector are boxplots and time series.
In the case of the boxplot glyph, the boxplot function with plot=F is used to generate the boxplot statistical details for each area. The name of the resulting list of 51 sets of boxplot statistics (one for each area) is placed in the panelData data.frame element for the boxplot column.
For the time series and time series with confidence interval, the glyphs require a 3 dimensional array of data. The first dimension ([area,,]) represents the areas. The second dimension ([,t,]) ranges from 2 to n. There is no upper limit, but 200-250 samples is a practical limit. One for each data point. The third dimension ([,,v]) provides the values at data point t for area st. [,,1] is the x axis value. For time series, is usually just the value 1 to n to order the y values. [,,2] is the median y value. For time series with confidence intervals: [,,3] is the lower value y and [,,4] is the upper value y.
Reference Lines
Reference lines can be created in arror, bar, dot, dotconf, dotse, and segbar glyphs by specifying the reference values in the RefVal= vector. A label appearing at the bottom of the column can be specified using the RefTxt= vector in the panelDesc data.frame.
The parameters in the panelDesc data.frame structure are:
The types of graphics for each column of panels can be specified by the following keywords in the "type variable":
The following are the type of glyphs that can be specified in the type vector:
"map", "mapcum","maptail","mapmedian"
"id"
"dot", "dotse","dotconf", "dotsignf", "bar", "arrow", "ts", "tsconf","scatdot", "segbar", "normbar", "ctrbar", "boxplot"
The following provides a description of each panel type:
- a non-highlighted map
- maps show the accumulated areas top to bottom
- maps show the accumulated areas from the top and bottom toward median area.
- the maps above the median highlight the areas above the median area and maps below the median highlight areas below the median area based on the sorting variable.
- generates a column with a color identifier (a filled in square) and the area abbreviation or name. The plotNames parameter in the micromapSEER call controls whether the area's full name or 2 character abbreviation is displayed.
- sequentially number areas from 1 (highest rank) to "n" (lowest rank)
- an arrow from value 1 to value 2 with value 2 the head of the arrow.
- a bar for a single set of values, The values can be positive or negative.
- a boxplot for each area using a data.frame generated by the boxplot function with plot=F. The name of the boxplot data.frame is passed to micromapSEER using the panelData vector.
- a dot for a single value using one set of values.
- a dot for a single value and its standard error using two values.
- a dot for a single value and its confidence interval using three values.
- a dot for a single value overlaid if value is not significant using two values: value for dot and P value.
- a time series line plot for each area. The glyph use the panelData vector to get the name of a three (3) dimensional array the data for the plot. The array contains one entry per area, 1 to 30+ data points and the x and y values. See section on panelData below for more details. A reasonable upper limit to the number of points is between 200-300. Only a few will be selected to be used as X-Axis labels. The format of the X-Axis label is controled by the "xIsDate" attribute on the array being set to TRUE. If the "xIsDate" attribute is not set to TRUE, the X-Axis will be formated as numeric and axisScaling can be preformed. If the "xIsDate" attribute is TRUE, the default date format of " or less than 90, a short date format will be used of " The x-axis date feature will override the specification of the axisScale call parameters on time series glyph columns.
- a time series line and confidence interval band for each area. The glyph use the panelData vector to get the name of a three (3) dimensional array the data for the plot. The array contains one entry per area, 1 to n data points and the x, y, lower y and upper y values. See section on panelData below for more details. A reasonable upper limit to the number of points is between 200-300. The format of the X-Axis label is controled by the "xIsDate" attribute on the array. If the "xIsDate" is set to TRUE, the X-Axis values will be format using the default date format of " date format of " TRUE, the X-Axis will be formated as numeric and axisScaling can be preformed. The x-axis date feature will override the specification of the axisScale call parameters on time series glyph columns.
- a horizontal stacked (segmented) bar plot starting at 0 using data in the statsDFrame data.frame. The col1 and col2 columns are used to indicate the first and last columns in the statsDFrame data.frame that contain the contiguous bar segment values (lengths). For example: The data for a 5 segment bar glyph is in columns 4 through 8 in the statsDFrame (5 columns). col1 is set to 4 to identify the first column and col2 is set to 8 to identify the last column in the sequence. Column names may be used, but the column identified in col1 must preceed the column identified in col2.
- a stacked bar plot where the data is normalized for each area by dividing the bar segment values by the sum of the values for all of the bars. The stacked bar plot for each area then ranges from 0 to 100% (edge to edge). The col1 and col2 columns are used to identify the first and last columns for bar data in the statsDFrame in the same way as for the "segbar" glyph (see above.)
- a stacked bar plot where the bar segments are centered around the middle of the data. If there is an even number of segments, the 0 point is between the lower half and the upper half of the segments. If there is an odd number of segments, the center is the midpoint of the middle segment. The other segments are plotted to the left and right of the center point. The col1 and col2 columns are used to indicate the first and last columns in the statsDFrame data.frame that contain the contiguous bar segment values. (See "segbar" type above for more information.)
- a set of 51 points with an x and y value per area. All points are plotted in each panel with the key areas in the panel highlighted. col1 indicates statsDFrame column containing the x values and col2 indicates the column containing the y values.
Example: type=c("id","map","rank", "boxplot")
To specify a micromapSEER with three columns, left to right,
containing the area label, a map and a boxplot.
Vectors of index numbers or names of columns in statsDFrame data.frame to be used as data for graphics. The uses of these three vectors are defined below:
glyphs do
not use the col1, col2, or col3 vectors to locate data
in the statsDFrame data.frame. If these vectors are present, the
corresponding entires should be NA
for the respective columns.
uses col1 to specify a single data column in statsDFrame data.frame to be ploted.
uses col1 to specify the data column in statsDFrame data.frame for the length of the bar. The data value can be positive or negative.
uses col1 and col2 to specify the data columns in statsDFrame data.frame to be used as the estimate and standard error values, respectively.
uses col1 and col2 to specify the data columns in statsDFrame data.frame to be used as the value for the dot and its associated P value.
uses col1 and col2 to specify the data columns in statsDFrame data.frame for the beginning and end values of the arrow.
uses col1 and col2 to specify the first and last columns in the statsDFrame data.frame. The statsDFrame data.frame columns from col1 to col2 are used for the length values of each bar in the glyph. col1 must preceed col2 in the statsDFrame data.frame. The minimum number of data columns is 2 columns with a maximum of 9 columns.
uses col1 and col2 to specify the x and y values respectivefully for a dot for each of the 51 areas and DC in a scatter dot plot.
uses col1, col2, and col3 to specify the data columns in statsDFrame data.frame for the estimate value, lower confidence interval, and upper confidence interval values.
See the table above.
A numeric vector used to specify the proportional width size of a glyph column in relation to all other glyph columns. If used, values must be included for all glyph columns except for the map and id glyphs, which are fixed width columns. The width of a glyph column is determined by summing all of the colSize values and dividing the sum into the value for each glyph column to yield a percentage of the available width to be allocated to each column. For example: colSize=c(NA,NA,10,10,5,15), does not affect columns 1 and 2. The percentages for columns 3 through 6 are 25%, 25%, 12.5% and 37.5%. If 4 inches of space is available, the columns will be allocated: 1, 1, 0.5, and 1.5 inches. The column widths are still regulated by the minimum and maximum column widths set in the package. If a value is missing for non-map or id glyph, the package will a value equal to the average of the provided values.
Character vectors provide the two column labels (titles) lines at the
top of each column. If no label is required, use ""
for a blank line.
Character vector used as a label at the bottom of
each column. This is typically used to show units of measure. If no label
is required, use ""
for a blank line.
Character vector used as the vertical (y) axis label
for ts, tsconf, and scatdot glyphs.
If no label is required, use ""
for a blank line.
Is a list of object names providing the reference values for each graphic column. The reference value is displayed as a dashed vertical line for each panel in the specified column.
Is a list of 1 or 2 labels to be displayed at the bottom of each column to identify the reference value.
List of object names containing the boxplot data list and/or an array of time series
data for each area. If boxplot and time series data are not used in a column,
then associated object names should be NA
.
For boxplot data, each row name in the boxplot list must be the area abbreviation (2 character) for the area associated with the data. There must be the same number of rows as in the name table and statsDFrame table. Each row must be data produced by the boxplot function. The area location identifier used in the statsDFrame data and must be placed in the boxplot$names (names) attribute for that set of boxplot data to be able to associate the individual boxplots to each area.
For the time series glyph (ts), the data must be a three (3) dimensional array.
The first dimension [st,,]
represent one entry for each area (1 to 51).
The second dimension [,t,]
indexes up to 30+ data points for the area.
The third dimension [,,v]
are the data point values at each data point.
[,,var{1}]
is
the x value and [,,2]
is the median y value for the data point.
The rownames associated with the first dimension must be the area location ids used
in the statsDFrame table to link the elements of this structure the presentation
order of the areas.
For the time series with confidence intervals glyph (tsconf), the array is extended to include:
[,,3]
and [,,4]
for the lower y and upper y values.
For time series data, the order of the first dimension of the array must match the
area order in the statsDFrame. For example, the data in dataArray[1,,]
is the
the area identified in statsDFrame[1,]
The Date feature allows the caller to request the TS X-Axis labels be formated at dates. This requires the data in the TS array has valid date data as the X data. These are numbers based on 1970-1-1 being day zero in the computer calendar. There are many functions in R to convert to and from characters and date variable. In the past, before this feature, users had to do work-a-rounds by using year numbers or year and faction numbers. Once you have inserted the date X values into the array [,,1], modify the class of the array to add the "Date" class. micrpmapST will inspect the array and find the "Date" class, flag it for internal operations and remove it. The date format of " the date format will be changed to " The date feature is only available on the Time Series Glyphs.
If axisScale is set to "s" or "sn", they will be ignored for any TS glyph using the date feature.
The panelDesc data.frame is used to describe the content of the
micromapST plot to the function. It contains the index of the data
in the statsDFrame
data.frame, the types of graphics to be used in
each column, titles, column headers, reference values and labels, etc.
A descriptor may be omitted if none of the panel plots need it.
Daniel B. Carr, George Mason University, Fairfax VA, with contributions from Jim Pearson and Linda Pickle of StatNet Consulting, LLC, Gaithersburg, MD
The PlotVis function will plot the boundary data contained in a border group. The border group boundary information is stored as a data.frame with an "x", "y", "hole", and "key" columns. The "key" column identifies the name table entry the boundary point and polygon belongs to. The "x" and "y" are the coordinates for the boundary vertex in the polygon, and the "hole" column identifies if the polygon is a hole in the area identified by the "key". At the end of each sequence of vectex for a polygon, the first vertex and the last vertex must be the same point. This is followed by a "x" and "y" coordinates of c(NA,NA) to tell the turn off drawing and wait for the next point. It's primary purpose is to visualize boundary data that has been prepared for micromapST and assist in developing new border groups for micromapST.
PlotVis ( VisBrd, VisCol, xTitle=NULL, xAxes=FALSE, xLwd=0.05)
PlotVis ( VisBrd, VisCol, xTitle=NULL, xAxes=FALSE, xLwd=0.05)
VisBrd |
- is a data frame containing a VisBorders formated collection of boundaries to be plotted. Refer to the section under the discussion on the areaVisBorder data.frame and it's format. In a border group, this would be the areaVisBorder, L2VisBorder, RegVisBorder, or L3VisBorders data.frame. |
VisCol |
- is a vector of the colors to fill each polygon in the VisBorder data frame. One color value is required for each "NA" (end of polygon) in the VisBorders data.frame. This vector must have the same number of elements at the VisBorder has polygons (point sets ending in NA.) If multiple polygons represent a single area, the function caller must compensate for this in the VisCol vector. The usual strategy is to assign a color to each row in the Name Table, then compare the keys in the Name Table and the keys in the VisBorder and assign the color for that Key/Polygon. Warning: R's polygon function does not advance to the next color in this list, if the vectors between the "NA" rows represent a point or a line, R's polygon function will not advance to the next color, nothing to fill. This will affect which colors shade each polygon. |
xTitle |
- is character string value to be used for the title of the VisBorders plot. The default for this call parameter is NULL. Only a single string can be used. |
xAxes |
- is a logical (FALSE or TRUE) indicator as to whether or not the X and Y Axis should be drawn. The default for this call parameter is FALSE. |
xLwd |
- is a numerical value of the boundary line width to use when drawing the areas in the map. The value may range from .1 up to 3 times the standard width. The default value is 0.5. |
More details to follow.
None
Jim Pearson, StatNet Consulting, LLC, Gaithersburg, MD
Randomly generated statistics to create rate, count and population data for the 18 U.S. Seer registries for testing of micromapSEER or micromapST using the "USSeerBG" border group. The data was generated by the SeerStat program and exported as a CSV file. The non-data lines at the bottom of the file must be removed.
data(Seer18Area)
data(Seer18Area)
A data frame with 18 observations, 1 for each Seer registries (sub-areas), on the following 11 variables.
a character vector of the name of the Seer Registry
a numeric vector of the incident rates from 2002 to 2006
a numeric vector of the incident counts
a numeric vector of the population
a numeric vector of the incident rates from 2007 to 2011
a numeric vector of the incident counts
a numeric vector of the population
a numeric vector of the rate trends per Seer registry
a numeric vector of the P Value related to the Rate Trend
a numeric vector of the lower confidence interval
a numeric vector of the upper confidence interval
This dataset is a randomly generated collection of data from 2002 to 2011 Seer Registry data to support testing of micromapST functions for the 18 U.S. Seer registries. The data has no relationship to the Seer registries and is only for test purposes. The row names are the Seer registry abbreviations. This dataset is used in the micromapST examples.
Linda W. Pickle and Jim Pearson of StatNet Consulting, LLC, Gaithersburg, MD
This dataset contains the 2012 population and area data for each of the districts in city of Seoul.
data(SeoulPopData)
data(SeoulPopData)
A data frame with 25 observations, 1 for each district, on the following 5 variables.
a character vector containing the full name of the Seoul district.
a character vector containing the city name => Seoul.
a numeric vector of the number of the district's 2012 population.
a numeric vector of the area square kilometers for the district.
a character vector containing the dates each district was founded.
This dataset was pulled from the China ??? government website on Dec. 2014. It contains the population and average income per person for each of Kansas' 105 counties.
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call argument is used to specify the border group dataset for the geographical area. The SeoulSKoreaBG border group dataset supports creating linked micromaps for the 25 districts in the city of Seoul South Korea. When the bordGrp call argument is set to SeoulSKoreaBG, the appropriate name table (county names and abbreviations) and the boundary data for the 25 districts are loaded in micromapST. The user's data is then linked to the boundary data via the district's name, abbreviated or ID based on the table below.
data(SeoulSKoreaBG)
data(SeoulSKoreaBG)
The SeoulSKoreaBG border group contains the following data.frames::
- contains specific parameters for the border group
- containing the names, abbreviations, and numerical identifier for the districts in the city of Seoul.
- the boundary point lists for each district in the city of Seoul Korea.
- the boundaries for an intermediate level. For this border group, this boundary data is not used and set to L3VisBorders as a place holder.
- the boundaries for an regional level. For this border group, this boundary data is not used and set to L3VisBorders as a place holder.
- the boundary of the city of Seoul South Korea
The Seoul district border group contains 25 district sub-areas. Each district has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets. No regions or L2 boundaries are defined for the Seoul district border group. The RegVisBorders and and L2VisBorders data.frames are set the L3VisBorders data.frame. The regional feature is disabled. The L3VisBorders dataset contains the outline of the city of Seoul.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area (district) using the fullname, abbreviation, and numerical identifier for each country to the <statsDFrame> data based on the setting of the rowNames call parameter.
A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area (district) in the name table, the sub-area is mapped but not colored.
The following are a list of the names, abbreviations, and ids for each country in the SeoulSKoreaBG border group.
name | ab | id |
Dobong | DO | 11100 |
Dongdaemun | DN | 11060 |
Dongjak | DG | 11200 |
Eunpyeong | EU | 11120 |
Gangdong | GA | 11090 |
Gangbuk | GN | 11250 |
Gangnam | GG | 11230 |
Gangseo | GS | 11160 |
Geumcheon | GE | 11180 |
Guro | GU | 11170 |
Gwanak | GW | 11210 |
Gwangjin | GJ | 11050 |
Jongro | JO | 11010 |
Jung | JU | 11070 |
Jungnang | JN | 11020 |
Mapo | MA | 11140 |
Nowon | NO | 11110 |
Seocho | SE | 11220 |
Seodaemun | SD | 11130 |
Seongbuk | SN | 11080 |
Seongdong | SG | 11040 |
Songpa | SO | 11240 |
Yangcheon | YA | 11150 |
Yeongdeungpo | YE | 11190 |
Yongsan | YO | 11030 |
The id field value is the ISO numerical code for the district.
The rowNames = "alias" or "alt_ab" and the regions features are not supported in the SeoulSKoreaBG border group.
US State 2010 population data by race and Hispanic ethnicity.
data(statePop2010)
data(statePop2010)
A data frame with 51 observations (one per state) on the following 6 variables per state:
an integer count of the Hispanic population
an integer count of the white population
an integer count of the black population
an integer count of the population other than white, black or Hispanic
a numeric percentage of the Hispanic population to the total population
a numeric percentage of the population other than white, black, or Hispanic
Each row has the state 2 character abbreviation as its row name.
The dataset contains 51 records, one for each state. The data represents the population count or percentage of the total population by race and Hispanic ethnicity within the state. This dataset is used by the micromapSEER examples with the "USStatesDF" border group.
United States Census Bureau, Population Total by State, by Race, Combinations of Two Races, and not Hispanic or Latino, 2010 (Summary File 1, Table QT-P4), URL = http://factfinder2.census.gov/.
When the user supplied data does not contain the exact Name or Abbr string to match the name table information. The data row cannot be linked or map to the micromapST graphics. An example is the many ways the District of Columbia in the U. S. is identified. Originally special code was used to identify these mismatches and correct them. The SynTable dataset now provide a open method to address the problem. The common string representing an area can be entered into the table and the correct Name and Abbr equivalent. When a data row's name does not match the name table, the Synonym Table is referenced to see it there is an alternative value to use.
data(SynTable)
data(SynTable)
A data frame with 51 observations (one per state) on the following 7 variables.
locid
a character string representing the name of the row in the user's provided data.frame.
Name
a character string containing the equivalent name table Name column value.
Abbr
a character string containing the equivalent name table Abbr column value.
Jim Pearson, Statnet Consulting, Gaithersburg, MD 20877
Data for Time Series Examples
The data are age-adjusted (2000 U.S. standard) female lung cancer mortality rates (per 100,000 population) for each year from 1996 to 2010.
data(TSdata)
data(TSdata)
This dataset is an array with dimensions of 51, 15, 4. The rownames of the array are the 51 state and DC abbreviations (2 characters). TSdata[,,1:4] contains the x (time) value, followed by the value for the line, then the lower 95% confidence limit, and finally the upper 95% confidence limit value.
The first dimension [st,,] of 51 elements contains each state or DC. This dimension is referenced by the rownames of the array.
The second dimension [,t,] of n elements in this case are the time periods in the time series. Our example uses the years 1996 to 2010 as the time period values. A reasonable number of points is between 20 and 30.
The third dimension [,,v] of 2 or 4 elements is the x or y values during the time period. If no confidence data is provided, the third dimension is 2:
data[,,1] is the X value
data[,,2] is the mid-Y value (Y)
If a confidence band is being plotted in \var{tsconf} graphs then there are 4 elements.
data[,,1] is the X value
data[,,2] is the mid-Y value (Y)
data[,,3] is the low-Y value
data[..4] is the high-Y value
For example, the x,y coordinates for year=1996 (time period = 1) for the first state (AK) is TSdata[1,1,c(1,2)].
This approach was done to allow a data matrix built for the "tsconf" glyphs to be used for a ts glyphs.
This data is used by micromapSEER with the "USStatesDF" border group.
# how to create a new time series data set tempTS <-read.table("...yourfilename.csv",sep=",",header=T) yrmat <-matrix(rep(1996:2010,51),nrow=51,ncol=15,byrow=T) # year labels ratemat<-as.matrix( tempTS[,c(8,13,18,23,28,33,38,43,48,53,58,63,68,73,78)] ) locimat<-as.matrix( tempTS[,c(9,14,19,24,29,34,39,44,49,54,59,64,69,74,79)] ) hicimat<-as.matrix( tempTS[,c(10,15,20,25,30,35,40,45,50,55,60,65,70,75,80)] ) workmat<-cbind(yrmat,ratemat,locimat,hicimat) TSdata <-NULL TSdata <-array(workmat,dim=c(51,15,4)) # change state ab from factors to characters. rownames(TSdata)<-as.character(tempTS$stab)
Surveillance, Epidemiology, and End Results (SEER) Program (www.seer.cancer.gov)
SEER*Stat Database: Mortality - All COD, Aggregated With State, Total U.S. (1969-2010)
(Katrina/Rita Population Adjustment), National Cancer Institute, DCCPS, Surveillance Research Program,
Surveillance Systems Branch, released April 2013.
Underlying mortality data provided by NCHS (www.cdc.gov/nchs).
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call argument is used to specify the border group dataset for the geographical area. The UKIrelandBG border group dataset supports creating linked micromaps for the all of the United Kingdom including Northern Ireland and the Isle of Man and Ireland. When the bordGrp call argument is set to UKIrelandBG, the appropriate name table (county names and abbreviations) and the 219 sub-areas (counties, etc.) boundary data is loaded into micromapST. The user's data is then linked to the boundary data via the name, abbreviation, or alternate abbreviation for each sub-area (county, etc.).
The United Kingdom and Ireland information was pulled from the UK and Ireland public web sites in March of 2015.
The UKIreland border group was constructed to provide an area with more than 100 sub-areas for testing micromapST and enhancing it's ability to handle a large number of sub-areas and generate usable linked micromaps.
data(UKIrelandBG)
data(UKIrelandBG)
The UKIrelandBG border group dataset contains the following data.frames:
- contains specific parameters for the border group
- containing the names, abbreviations, alternate abbreviation, or numerical identifier for each of the United Kingdom or Ireland counties.
- the boundary point lists for each sub-area.
- the boundaries for an intermediate level. For the United Kingdom and Ireland border group, this boundary point list is not used and is set to equal L3VisBorders data.frame for the border group.
- the boundaries for the 4 United Kingdom and Ireland regions or realms: England, Wales, Scotland, Northern Ireland, Ireland, and Isle of Man.
- the boundary of the United Kingdom and Ireland area.
The UKIreland border group contains 219 sub-areas (counties, etc.) Each registry has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets. Regions are defined in this border group as the 6 country and kingdom regions in the UK and Ireland. The regions feature is enable. The siz (6) regions are: England, Scotland, Wales, Northern Ireland, Ireland and Isle of Man. The names, abbreviations, alternate abbreviations and IDs for the counties in the UKIreland border group are:
Name | ab | id | alt_ab | region |
Aberdeen | GB.ABE | 826139 | ABE | SCT |
Aberdeenshire | GB.ABD | 826140 | ABD | SCT |
c | GB.AGY | 826171 | AGY | WLS |
Angus | GB.ANS | 826141 | ANS | SCT |
Antrim | GB.ANT | 826113 | ANT | NIR |
Ards | GB.ARD | 826114 | ARD | NIR |
Argyll and Bute | GB.AGB | 826142 | AGB | SCT |
Armagh | GB.ARM | 826115 | ARM | NIR |
Ballymena | GB.BLA | 826116 | BLA | NIR |
Ballymoney | GB.BLY | 826117 | BLY | NIR |
Banbridge | GB.BNB | 826118 | BNB | NIR |
Barking and Dagenham | GB.BDG | 826001 | BDG | ENG |
Bath and North East Somerset | GB.BAS | 826002 | BAS | ENG |
Bedfordshire | GB.BDF | 826003 | BDF | ENG |
Belfast | GB.BFS | 826119 | BFS | NIR |
Berkshire | GB.BRK | 826004 | BRK | ENG |
Bexley | GB.BEX | 826005 | BEX | ENG |
Blackburn with Darwen | GB.BBD | 826006 | BBD | ENG |
Blaenau Gwent | GB.BGW | 826172 | BGW | WLS |
Bournemouth | GB.BMH | 826007 | BMH | ENG |
Brent | GB.BEN | 826008 | BEN | ENG |
Bridgend | GB.BGE | 826173 | BGE | WLS |
Brighton and Hove | GB.BNH | 826009 | BNH | ENG |
Bristol | GB.BST | 826010 | BST | ENG |
Bromley | GB.BRY | 826011 | BRY | ENG |
Buckinghamshire | GB.BKM | 826012 | BKM | ENG |
Caerphilly | GB.CAY | 826174 | CAY | WLS |
Cambridgeshire | GB.CAM | 826013 | CAM | ENG |
Camden | GB.CMD | 826014 | CMD | ENG |
Cardiff | GB.CRF | 826175 | CRF | WLS |
Carlow | IE.CW | 372001 | CW | IRE |
Carmarthenshire | GB.CMN | 826176 | CMN | WLS |
Carrickfergus | GB.CKF | 826120 | CKF | NIR |
Castlereagh | GB.CSR | 826121 | CSR | NIR |
Cavan | IE.CN | 372002 | CN | IRE |
Ceredigion | GB.CGN | 826177 | CGN | WLS |
Cheshire | GB.CHS | 826015 | CHS | ENG |
Clackmannanshire | GB.CLK | 826143 | CLK | SCT |
Clare | IE.CE | 372003 | CE | IRE |
Coleraine | GB.CLR | 826122 | CLR | NIR |
Conwy | GB.CWY | 826178 | CWY | WLS |
Cookstown | GB.CKT | 826123 | CKT | NIR |
Cork | IE.CO | 372004 | CO | IRE |
Cornwall | GB.CON | 826016 | CON | ENG |
Craigavon | GB.CGV | 826124 | CGV | NIR |
Croydon | GB.CRY | 826017 | CRY | ENG |
Cumbria | GB.CMA | 826018 | CMA | ENG |
Darlington | GB.DAL | 826019 | DAL | ENG |
Denbighshire | GB.DEN | 826179 | DEN | WLS |
Derby | GB.DER | 826020 | DER | ENG |
Derbyshire | GB.DBY | 826021 | DBY | ENG |
Derry | GB.DRY | 826125 | DRY | NIR |
Devon | GB.DEV | 826022 | DEV | ENG |
Donegal | IE.DL | 372005 | DL | IRE |
Dorset | GB.DOR | 826023 | DOR | ENG |
Down | GB.DOW | 826126 | DOW | NIR |
Dublin | IE.D | 372006 | D | IRE |
Dumfries and Galloway | GB.DGY | 826144 | DGY | SCT |
Dundee | GB.DND | 826145 | DND | SCT |
Dungannon | GB.DGN | 826127 | DGN | NIR |
Durham | GB.DUR | 826024 | DUR | ENG |
Ealing | GB.EAL | 826025 | EAL | ENG |
East Ayrshire | GB.EAY | 826146 | EAY | SCT |
East Dunbartonshire | GB.EDU | 826147 | EDU | SCT |
East Lothian | GB.ELN | 826148 | ELN | SCT |
East Renfrewshire | GB.ERW | 826149 | ERW | SCT |
East Riding of Yorkshire | GB.ERY | 826026 | ERY | ENG |
East Sussex | GB.ESX | 826027 | ESX | ENG |
Edinburgh | GB.EDH | 826150 | EDH | SCT |
Eilean Siar | GB.ELS | 826151 | ELS | SCT |
Enfield | GB.ENF | 826028 | ENF | ENG |
Essex | GB.ESS | 826029 | ESS | ENG |
Falkirk | GB.FAL | 826152 | FAL | SCT |
Fermanagh | GB.FER | 826128 | FER | NIR |
Fife | GB.FIF | 826153 | FIF | SCT |
Flintshire | GB.FLN | 826180 | FLN | WLS |
Galway | IE.G | 372007 | G | IRE |
Glasgow | GB.GLG | 826154 | GLG | SCT |
Gloucestershire | GB.GLS | 826030 | GLS | ENG |
Greenwich | GB.GRE | 826031 | GRE | ENG |
Gwynedd | GB.GWN | 826181 | GWN | WLS |
Hackney | GB.HCK | 826032 | HCK | ENG |
Halton | GB.HAL | 826033 | HAL | ENG |
Hammersmith and Fulham | GB.HMF | 826034 | HMF | ENG |
Hampshire | GB.HAM | 826035 | HAM | ENG |
Haringey | GB.HRY | 826036 | HRY | ENG |
Harrow | GB.HRW | 826037 | HRW | ENG |
Hartlepool | GB.HPL | 826038 | HPL | ENG |
Havering | GB.HAV | 826039 | HAV | ENG |
Herefordshire | GB.HEF | 826040 | HEF | ENG |
Hertfordshire | GB.HRT | 826041 | HRT | ENG |
Highland | GB.HLD | 826155 | HLD | SCT |
Hillingdon | GB.HIL | 826042 | HIL | ENG |
Hounslow | GB.HNS | 826043 | HNS | ENG |
Inverclyde | GB.IVC | 826156 | IVC | SCT |
Isle of Wight | GB.IOW | 826044 | IOW | ENG |
Islington | GB.ISL | 826045 | ISL | ENG |
Kensington and Chelsea | GB.KEC | 826046 | KEC | ENG |
Kent | GB.KEN | 826047 | KEN | ENG |
Kerry | IE.KY | 372008 | KY | IRE |
Kildare | IE.KE | 372009 | KE | IRE |
Kilkenny | IE.KK | 372010 | KK | IRE |
Kingston upon Hull | GB.KHL | 826048 | KHL | ENG |
Kingston upon Thames | GB.KTT | 826049 | KTT | ENG |
Lambeth | GB.LBH | 826050 | LBH | ENG |
Lancashire | GB.LAN | 826051 | LAN | ENG |
Laoighis | IE.LS | 372011 | LS | IRE |
Larne | GB.LRN | 826129 | LRN | NIR |
Leicester | GB.LCE | 826052 | LCE | ENG |
Leicestershire | GB.LEC | 826053 | LEC | ENG |
Leitrim | IE.LM | 372012 | LM | IRE |
Lewisham | GB.LEW | 826054 | LEW | ENG |
Limavady | GB.LMV | 826130 | LMV | NIR |
Limerick | IE.LK | 372013 | LK | IRE |
Lincolnshire | GB.LIN | 826055 | LIN | ENG |
Lisburn | GB.LSB | 826131 | LSB | NIR |
London | GB.LND | 826056 | LND | ENG |
Longford | IE.LD | 372014 | LD | IRE |
Louth | IE.LH | 372015 | LH | IRE |
Luton | GB.LUT | 826057 | LUT | ENG |
Magherafelt | GB.MFT | 826132 | MFT | NIR |
Manchester | GB.MAN | 826058 | MAN | ENG |
Mayo | IE.MO | 372016 | MO | IRE |
Meath | IE.MH | 372017 | MH | IRE |
Medway | GB.MDW | 826059 | MDW | ENG |
Merseyside | GB.MSY | 826060 | MSY | ENG |
Merthyr Tydfil | GB.MTY | 826182 | MTY | WLS |
Merton | GB.MRT | 826061 | MRT | ENG |
Middlesbrough | GB.MDB | 826062 | MDB | ENG |
Midlothian | GB.MLN | 826157 | MLN | SCT |
Milton Keynes | GB.MIK | 826063 | MIK | ENG |
Monaghan | IE.MN | 372018 | MN | IRE |
Monmouthshire | GB.MON | 826183 | MON | WLS |
Moray | GB.MRY | 826158 | MRY | SCT |
Moyle | GB.MYL | 826133 | MYL | NIR |
Neath Port Talbot | GB.NTL | 826184 | NTL | WLS |
Newham | GB.NWM | 826064 | NWM | ENG |
Newport | GB.NWP | 826185 | NWP | WLS |
Newry and Mourne | GB.NYM | 826134 | NYM | NIR |
Newtownabbey | GB.NTA | 826135 | NTA | NIR |
Norfolk | GB.NFK | 826065 | NFK | ENG |
North Ayrshire | GB.NAY | 826159 | NAY | SCT |
North Down | GB.NDN | 826136 | NDN | NIR |
North East Lincolnshire | GB.NEL | 826066 | NEL | ENG |
North Lanarkshire | GB.NLK | 826160 | NLK | SCT |
North Lincolnshire | GB.NLN | 826067 | NLN | ENG |
North Somerset | GB.NSM | 826068 | NSM | ENG |
North Yorkshire | GB.NYK | 826069 | NYK | ENG |
Northamptonshire | GB.NTH | 826070 | NTH | ENG |
Northumberland | GB.NBL | 826071 | NBL | ENG |
Nottingham | GB.NGM | 826072 | NGM | ENG |
Nottinghamshire | GB.NTT | 826073 | NTT | ENG |
Offaly | IE.OY | 372019 | OY | IRE |
Omagh | GB.OMH | 826137 | OMH | NIR |
Orkney Islands | GB.ORK | 826161 | ORK | SCT |
Oxfordshire | GB.OXF | 826074 | OXF | ENG |
Pembrokeshire | GB.PEM | 826186 | PEM | WLS |
Perthshire and Kinross | GB.PKN | 826162 | PKN | SCT |
Peterborough | GB.PTE | 826075 | PTE | ENG |
Plymouth | GB.PLY | 826076 | PLY | ENG |
Poole | GB.POL | 826077 | POL | ENG |
Portsmouth | GB.POR | 826078 | POR | ENG |
Powys | GB.POW | 826187 | POW | WLS |
Redbridge | GB.RDB | 826079 | RDB | ENG |
Redcar and Cleveland | GB.RCC | 826080 | RCC | ENG |
Renfrewshire | GB.RFW | 826163 | RFW | SCT |
Rhondda, Cynon, Taff | GB.RCT | 826188 | RCT | WLS |
Richmond upon Thames | GB.RIC | 826081 | RIC | ENG |
Roscommon | IE.RN | 372020 | RN | IRE |
Rutland | GB.RUT | 826082 | RUT | ENG |
Scottish Borders | GB.SCB | 826164 | SCB | SCT |
Shetland Islands | GB.ZET | 826165 | ZET | SCT |
Shropshire | GB.SHR | 826083 | SHR | ENG |
Sligo | IE.SO | 372021 | SO | IRE |
Somerset | GB.SOM | 826084 | SOM | ENG |
South Ayrshire | GB.SAY | 826166 | SAY | SCT |
South Gloucestershire | GB.SGC | 826085 | SGC | ENG |
South Lanarkshire | GB.SLK | 826167 | SLK | SCT |
South Yorkshire | GB.SYK | 826086 | SYK | ENG |
Southampton | GB.STH | 826087 | STH | ENG |
Southend-on-Sea | GB.SOS | 826088 | SOS | ENG |
Southwark | GB.SWK | 826089 | SWK | ENG |
Staffordshire | GB.STS | 826090 | STS | ENG |
Stirling | GB.STG | 826168 | STG | SCT |
Stockton-on-Tees | GB.STT | 826091 | STT | ENG |
Stoke-on-Trent | GB.STE | 826092 | STE | ENG |
Strabane | GB.STB | 826138 | STB | NIR |
Suffolk | GB.SFK | 826093 | SFK | ENG |
Surrey | GB.SRY | 826094 | SRY | ENG |
Sutton | GB.STN | 826095 | STN | ENG |
Swansea | GB.SWA | 826189 | SWA | WLS |
Swindon | GB.SWD | 826096 | SWD | ENG |
Telford and Wrekin | GB.TFW | 826097 | TFW | ENG |
Thurrock | GB.THR | 826098 | THR | ENG |
Tipperary | IE.TA | 372022 | TA | IRE |
Torbay | GB.TOB | 826099 | TOB | ENG |
Torfaen | GB.TOF | 826190 | TOF | WLS |
Tower Hamlets | GB.TWH | 826100 | TWH | ENG |
Tyne and Wear | GB.TWR | 826101 | TWR | ENG |
Vale of Glamorgan | GB.VGL | 826191 | VGL | WLS |
Waltham Forest | GB.WFT | 826102 | WFT | ENG |
Wandsworth | GB.WND | 826103 | WND | ENG |
Warrington | GB.WRT | 826104 | WRT | ENG |
Warwickshire | GB.WAR | 826105 | WAR | ENG |
Waterford | IE.WD | 372023 | WD | IRE |
West Dunbartonshire | GB.WDU | 826169 | WDU | SCT |
West Lothian | GB.WLN | 826170 | WLN | SCT |
West Midlands | GB.WMD | 826106 | WMD | ENG |
West Sussex | GB.WSX | 826107 | WSX | ENG |
West Yorkshire | GB.WYK | 826108 | WYK | ENG |
Westmeath | IE.WH | 372024 | WH | IRE |
Westminster | GB.WSM | 826109 | WSM | ENG |
Wexford | IE.WX | 372025 | WX | IRE |
Wicklow | IE.WW | 372026 | WW | IRE |
Wiltshire | GB.WIL | 826110 | WIL | ENG |
Worcestershire | GB.WOR | 826111 | WOR | ENG |
Wrexham | GB.WRX | 826192 | WRX | WLS |
York | GB.YOR | 826112 | YOR | ENG |
Isle of Man | IM | 833000 | IMN | IMN |
When compiling the abbreviations for this border group, multiple sets of abbreviations were found. The two most common sets are included in the border group as "ab" and "alt_ab" types of rowNames.
The L3VisBorders dataset contains the outline of the UK and Ireland.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area (registry) using the fullname, abbreviation, alternate abbreviation and numerical identifier for each county/provence to the <statsDFrame> data based on the setting of the rowNames call argument.
A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area in the name table, the sub-area is mapped but not colored.
The dataRegionsOnly call parameter instructs the package to only map the regions with Seer registers with data. The regions are the four census regions: England, Scotland, Wales, Isle of Man, Northern Irelend and Ireland.
NCI
???? Retrieved 2013-01-10.
The UKIrelandPopData and UKIrelandPopData2 datasets contain the county populations and area statistics for the UK and Ireland..
data(UKIrelandPopData)
data(UKIrelandPopData)
A data frame with 218 rows, 1 for each county, with 5 variables for each county:
a character vector containing the UK-Ireland county name.
a character vector containing the ISO 2 character abbreviation for the UK and Ireland country. Warning: the UKIreland border group uses the newer ISO 3 character notation for it\'s abbreviations.
a numeric vector of the county's population.
a numeric vector of the county's area in square miles.
a numeric vector of the county's area in square kilometers.
This dataset was pulled from several UK and Ireland website in January, 2015. The difference between UKIrelandPopData and UKIrelandPopData2 is incorrect labeling of Anglesey as Anglesey, Isle of.
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call argument is used to specify the border group dataset for the geographical area. The USSeerBG border group dataset supports creating linked micromaps for the 20 Seer registries in the U. S. When the bordGrp call argument is set to USSeerBG, the appropriate name table (county names and abbreviations) and the 20 sub-areas (Seer registries) boundary data is loaded in micromapST. The user's data is then linked to the boundary data via the Seer registry's name, abbreviated, alias match or ID based on the table below.
The 20 U. S. Seer registries are the accepted registries as of January 2010 funded by NCI.
data(USSeerBG)
data(USSeerBG)
The USSeerBG border group dataset contains the following data.frames:
- contains specific parameters for the border group
- containing the names, abbreviations, numerical identifier and alias matching string for each of the 20 Seer registries.
- the boundary point lists for each area.
- the boundaries for an intermediate level. For Seer registry border group, L2VisBorders contains the boundaries for the 51 states and DC in the U. S to help provide a geographical reference of the registries to the states.
- the boundaries for the 4 U. S. Census regions in the U. S in support of the region feature.
- the boundary of the U. S.
The Seer Registries border group contains 20 Seer Registry sub-areas. Each registry has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets.
Regions are defined in this border group as the 4 census regions in the U. S. The regions feature is enable. The four census regions are: NorthEast, South, MidWest, and West. The states and Seer registries in each region are:
state | Seer Registries | region |
Alabama | <none> | South |
Alaska | Alaska Natives | West |
Arizona | Arizona Natives | West |
Arkansas | <none> | South |
California | California-LA, | West |
California-Other, | ||
California-SF, | ||
California-SJ | ||
Colorado | <none> | West |
Connecticut | Connecticut | NorthEast |
Delaware | <none> | South |
District of Columbia | <none> | South |
Florida | <none> | South |
Georgia | Georgia-Atlanta, | South |
Georgia-Other, | ||
Georgia-Rural | ||
Hawaii | Hawaii | West |
Idaho | <none> | West |
Illinois | <none> | MidWest |
Indiana | <none> | MidWest |
Iowa | Iowa | MidWest |
Kansas | <none> | MidWest |
Kentucky | Kentucky | South |
Louisiana | Louisiana | South |
Maine | <none> | NorthEast |
Maryland | <none> | South |
Massachusetts | <none> | NorthEast |
Michigan | Michigan-Detroit | MidWest |
Minnesota | <none> | MidWest |
Mississippi | <none> | South |
Missouri | <none> | MidWest |
Montana | <none> | West |
Nebraska | <none> | MidWest |
Nevada | <none> | West |
New Hampshire | <none> | NorthEast |
New Jersey | New Jersey | NorthEast |
New Mexico | New Mexico | West |
New York | <none> | NorthEast |
North Carolina | <none> | South |
North Dakota | <none> | MidWest |
Ohio | <none> | MidWest |
Oklahoma | Oklahoma-Cherokee | South |
Oregon | <none> | West |
Pennsylvania | <none> | NorthEast |
Rhode Island | <none> | NorthEast |
South Carolina | <none> | South |
South Dakota | <none> | MidWest |
Tennessee | <none> | South |
Texas | <none> | South |
Utah | Utah | West |
Vermont | <none> | NorthEast |
Virginia | <none> | South |
Washington | Washington-Seattle | South |
West Virginia | <none> | South |
Wisconsin | <none> | MidWest |
Wyoming | <none> | West |
The L3VisBorders dataset contains the outline of the United States.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area (registry) using the fullname, abbreviation, and numerical identifier for each country to the <statsDFrame> data based on the setting of the rowNames call argument.
A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area (registry) in the name table, the sub-area (registry) is mapped but not colored.
The following are a list of the names, abbreviations, alias and IDs for each country in the USSeerBG border group.
Name | ab | alias string | id | counties | region |
Alaska Natives | AK-NAT | ALASKA NATIVES | 18 | all | West |
Arizona Natives | AZ-NAT | ARIZONA NATIVES | 20 | all | West |
California-LA | CA-LA | LOS ANGELES | 4 | Los Angeles | West |
California-SF | CA-SF | SAN FRANCISCO | 2 | Alameda, | West |
Contra Costa, | |||||
Marin, | |||||
San Francisco, | |||||
San Mateo | |||||
California-SJ | CA-SJ | SAN JOSE | 3 | Montersey | West |
San Benito, | |||||
Santa Clara, | |||||
Santa Cruz | |||||
California-Other | CA-OTH | CALIFORNIA EXCLUDING | 5 | all other counties | West |
Connecticut | CT | CONNECTICUT | 1 | all | NorthEast |
Georgia-Atlanta | GA-ATL | ATLANTA | 6 | Clayton, Cobb, DeKalb, | South |
Fulton, Gwinnett | |||||
Georgia-Rural | GA-RUR | RURAL GEORGIA | 8 | Glascock, Greene, Hancock, | South |
Jasper, Jefferson, Morgan, | |||||
Putnam, Taliaferro, Warren, | |||||
Washington | |||||
Georgia-Other | GA-OTH | GREATER GEORGIA | 7 | all other counties | South |
Hawaii | HI | HAWAII | 9 | all | West |
Iowa | IA | IOWA | 10 | all | MidWest |
Kentucky | KY | KENTUCKY | 14 | all | South |
Michigan-Detroit | MI-DET | DETROIT | 15 | Macomb, | MidWest |
Oakland, | |||||
Wayne | |||||
New Jersey | NJ | NEW JERSEY | 11 | all | NorthEast |
New Mexico | NM | NEW MEXICO | 12 | all | West |
Oklahoma-Cherokee | OK-CHE | OKLAHOMA | 19 | Adair, | South |
Cherokee, | |||||
Craig, | |||||
Delaware, | |||||
Mayes, | |||||
McIntosh, | |||||
Muskogee, | |||||
Nowata, | |||||
Ottawa, | |||||
Rogers, | |||||
Seqouyah, | |||||
Tulsa, | |||||
Wagnorer, | |||||
Washington | |||||
Utah | UT | UTAH | 16 | all | West |
Washington-Seattle | WA-SEA | SEATTLE | 17 | Clallam, | South |
Grays Harbor, | |||||
Island, | |||||
Jefferson, | |||||
King, | |||||
Kitsap, | |||||
Mason, | |||||
Pierce, | |||||
San Juan, | |||||
Skagit, | |||||
Snohomish, | |||||
Thurston, | |||||
Whatcom | |||||
The rowNames = alias and the regions = TRUE features are enabled in the USSeerBG border group.
The alias option is designed to allow the package to match the registry labels created by the Seer Stat website when exporting Seer data for analysis. The alias match is a "contains" match, so the registry field in the user data must "contain" the "alias" values listed in the above table. To help generalize the match, the user's registry value is stripped of any punctuation, control characters and multiple spaces (blanks, tabs, cr, lf) are reduced to a single blank and the string is converted to all upper case. Then the wild card match is performed.
The dataRegionOnly call parameter (when set to TRUE) instructs the package to only map the regions with Seer registers with data. The regions used are the four census regions: NorthEast, South, MidWest and West. The RegVisBorders data.frame contains the outline of each of these regions. For example: if Seer registry data is provided for the only the New Mexico, Utah and California Registries in the West region, then only the states and regional boundary for the West region are drawn.
The USSeerBG border group does not contain or support an alternate set of abbreviations. If rowNames is set to alt_ab, an warning is generated and the standard Seer registry abbreviations are used.
The following steps should be used to export data for micromapST's use from the SEER*Stat Website:
Log on to the SEER^Stat website.
Create the matrix of results you want in SEER*Stat.
Click on Matrix, Export, Results as Text File (if you created multiple matrices of results, make sure that the one you want to export is highlighted)
In the Matrix Export Options window, click on:
Output variables as Labels without quotes
Remove all thousands separators
Output variable names before data
Preserve matrix columns & rename fields
Leave defaults clicked for Line delimiter, Missing Character, and Field delimiter
Change names and locations of text and dictionary files from defaults to the appropriate name and directory location.
To read the resulting text file into R use the read.delim
function with header = TRUE.
Follow the read.delim
call with a str
function to verify the data was read correctly.
dataT <- read.delim("c:\datadir\seerstat.txt",header=FALSE) str(dataT)
NCI
United States National Cancer Institute Seer Website at www.seer.cancer.gov; Seer Software at seer.cancer.gov/seerstat.; United States Census Bureau, Geography Division. "Census Regions and Divisions of the United States" (PDF). Retrieved 2013-01-10.
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call argument is used to specify the border group dataset for the geographical area. When micromapSt function is used to micromap the U.S. States and DC areas, no border group needs to be specified. The USStatesBG border group is the same as the original sub-areas (states and DC) and boundaries used by the all previous versions of the micromapST package. By default micromapST loads the area fullnames, abbreviations, IDs and boundaries files for 50 U. S. states and the District of Columbia for the processing of user data and the creation of the requested linked micromap.
data(USStatesBG)
data(USStatesBG)
The USStatesBG border group contains in the following data.frames:
- specific parameters associated with this border group
- containing the names, abbreviations, numerical identifier and alias matching string for each of the 51 U. S. States and D.C.
- the boundary point lists for each of the 51 States and D.C..
- the boundaries for the U. S. states and DC. It is identical to areaVisBorders data.frame.
- the boundaries for the 4 U. S. census regions
- the boundary of the U.S
Refer to the section on the border group data.frames for a detailed discussion on the formats and usage of each of the above data.frame.
In this border group, there are 51 areas (states and DC) and information and names in the areaNamesAbbrIDs data.frame and there boundaries in the areaVisBorders data.frame. The L2VisBorders data.frame contains a copy of the areaVisBorders data.frame to allow heavier overlaying of the state and DC boundaries during mapping. The RegVisBorders data.frame contains the information and boundaries for the 4 U. S. census regions. The L3VisBorders dataset contains the information on the boundaries of the U.S.
Alaska and Hawaii are relocated to below California and Alaska is reduced in size by 50
The names, abbreviations, id, and assigned U. S. regions used in the areaNamesAbbrsIDs data.frame the the 50 U. S. States and DC are as follows:
name | ab | id | region |
Alabama | AL | 01 | South |
Alaska | AK | 02 | West |
Arizona | AZ | 04 | West |
Arkansas | AR | 05 | South |
California | CA | 06 | West |
Colorado | CO | 08 | West |
Connecticut | CT | 09 | NorthEast |
Delaware | DE | 10 | South |
DC | DC | 11 | South |
Florida | FL | 12 | South |
Georgia | GA | 13 | South |
Hawaii | HI | 15 | West |
Idaho | ID | 16 | West |
Illinois | IL | 17 | MidWest |
Indiana | IN | 18 | MidWest |
Iowa | IA | 19 | MidWest |
Kansas | KS | 20 | MidWest |
Kentucky | KY | 21 | South |
Louisiana | LA | 22 | South |
Maine | ME | 23 | NorthEast |
Maryland | MD | 24 | South |
Massachusetts | MA | 25 | NorthEast |
Michigan | MI | 26 | MidWest |
Minnesota | MN | 27 | MidWest |
Mississippi | MI | 28 | South |
Missouri | MO | 29 | MidWest |
Montana | MT | 30 | West |
Nebraska | NE | 31 | MidWest |
Neveda | NV | 32 | West |
New Hampshire | NH | 33 | NorthEast |
New Jersey | NJ | 34 | NorthEast |
New Mexico | NM | 35 | West |
New York | NY | 36 | NorthEast |
North Carolina | NC | 37 | South |
North Dakota | ND | 38 | MidWest |
Ohio | OH | 39 | MidWest |
Oklahoma | OK | 40 | South |
Oregon | OR | 41 | West |
Pennsylvania | PA | 42 | NorthEast |
Rhode Island | RI | 44 | NorthEast |
South Carolina | SC | 45 | South |
South Dakota | SD | 46 | MidWest |
Tennessee | TN | 47 | South |
Texas | TX | 48 | South |
Utah | UT | 49 | West |
Vermont | VT | 50 | NorthEast |
Virginia | VA | 51 | South |
Washington | WA | 53 | West |
West Virginia | WV | 54 | South |
Wisconsin | WI | 55 | MidWest |
Wyoming | WY | 56 | West |
All data must be tagged with the name, abbreviation or id strings to be able to find the associated boundaries information. With the USStatesBG only, the package will accept several variations on the full name for DC. They include: Washington, D.C., District of Columbia, and D. C. All can be with or without punctuation and upper and lower case. When detected, the name is translated to "DC".
The dataRegionsOnly call parameter can be set to TRUE to request the package to limit the mapping to only states in regional areas where data is being mapped and omit states and regions that do not contain data. This allows the caller to focus the micromaps on one of the four (4) census regions: NorthEast, South, Midwest, or West.
The rowNames = alias, rowNames = alt_ab are not support in the USStatesBG border group.
NIST - Federal Information Processing Standards and U. S. Census website.
NIST FIPS 6-4 Standards
The micromapST function has the ability to generate linked micromaps for any geographical area. To specify the geographical area, the bordGrp call parameter is used to specify the border group dataset for the geographical area. The UtahBG border group dataset is contained within this package and supports creating linked micromaps for the 29 counties in the state of Utah. When the bordGrp call parameter is set to UtahBG, the appropriate name table (county names and abbreviations) and the 29 sub-areas (countries) boundary data is loaded in micromapST. The user's data is then linked to the boundary data via the county's name, abbreviation, alternate abbreviation, or ID based on the table below.
data(UtahBG)
data(UtahBG)
The UtahBG border group dataset contains the following data.frames:
- contains specific parameters for the border group
- containing the names, abbreviations, and numerical identifier for the 29 counties in the state of Utah.
- the boundary point lists for each county area in Utah.
- the boundaries for an intermediate level. For this border group, this boundary data.frame is not used and set to L3VisBorders as a place holder.
- the boundaries for an intermediate level. For this border group, this boundary data.frame is not used and set to L3VisBorders as a place holder.
- the boundary of the state of Utah
The Utah county border group contains 29 county sub-areas. Each county has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets. No regions are defined in the Utah county border group, so the L2VisBorders dataset is not used and the regions option is disabled. The L3VisBorders dataset contains the outline of the state of Utah.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame provides the linkages to the boundary data for each sub-area (county) using the fullname, abbreviation, and numerical identifier for each country to the <statsDFrame> data based on the setting of the rowNames call argument.
A identified column (idCol) or the data.frame row.names must match one of the types of names in the border group's areaNamesAbbrsIDs data.frame name table. If the data row does not match a value in the name table, an warning is issued and the data is ignored. If no data is present for a sub-area (county) in the name table, the sub-area is mapped but not colored.
The following are a list of the names, abbreviations, alternate abbreviations and IDs for each country in the UtahBG border group.
name | ab | alt_ab | id |
Beaver | BV | BEA | 49001 |
Box Elder | BE | BOX | 49003 |
Cache | CH | CAC | 49005 |
Carbon | CA | CAR | 49007 |
Daggett | DAG | DAG | 49009 |
Davis | DAV | DAV | 49011 |
Duchesne | DU | DUC | 49013 |
Emery | EM | EME | 49015 |
Garfield | GA | GAR | 49017 |
Grand | GR | GRA | 49019 |
Iron | IR | IRO | 49021 |
Juab | JB | JUA | 49023 |
Kane | KN | KAN | 49025 |
Millard | MI | MIL | 49027 |
Morgan | MO | MOR | 49029 |
Piute | PT | PIU | 49031 |
Rich | RH | RIC | 49033 |
Salt Lake | SL | SAL | 49035 |
San Juan | SJ | SNJ | 49037 |
Sanpete | SP | SNP | 49039 |
Sevier | SV | SEV | 49041 |
Summit | SU | SUM | 49043 |
Tooele | TO | TOO | 49045 |
Uintah | UI | UIN | 49047 |
Utah | UT | UTA | 49049 |
Wasatch | WS | WST | 49051 |
Washington | WA | WSH | 49053 |
Wayne | WN | WAY | 49055 |
Weber | WB | WEB | 49057 |
When compiling the information for the UtahBG border group, it was not clear what was the standard accepted abbreviation for each county. Therefore, two sets of abbreviations for each county are included. The first abbreviation set can be referenced by setting the rowNames call parameter to "ab". The second (alternate) abbreviation set can be used by setting the rowNames to "alt_ab".
The id field value is the 5 digit U. S. state and county FIPS code.
The rowNames = "alias" and the regions features are not supported in the UtahBG border group.
This dataset contains the Utah county populations for each year from 1940 to 2011.
data(UtahPopData)
data(UtahPopData)
A data frame with 29 rows, 1 for each county, with 73 variables for each county:
a character vector containing the Utah county full name.
a numeric vector of the county's population in 1940.
a numeric vector of the county's population in 1941.
a numeric vector of the county's population in 1942.
a numeric vector of the county's population in 1943.
a numeric vector of the county's population in 1944.
a numeric vector of the county's population in 1945.
a numeric vector of the county's population in 1946.
a numeric vector of the county's population in 1947.
a numeric vector of the county's population in 1948.
a numeric vector of the county's population in 1949.
a numeric vector of the county's population in 1950.
a numeric vector of the county's population in 1951.
a numeric vector of the county's population in 1952.
a numeric vector of the county's population in 1953.
a numeric vector of the county's population in 1954.
a numeric vector of the county's population in 1955.
a numeric vector of the county's population in 1956.
a numeric vector of the county's population in 1957.
a numeric vector of the county's population in 1958.
a numeric vector of the county's population in 1959.
a numeric vector of the county's population in 1960.
a numeric vector of the county's population in 1961.
a numeric vector of the county's population in 1962.
a numeric vector of the county's population in 1963.
a numeric vector of the county's population in 1964.
a numeric vector of the county's population in 1965.
a numeric vector of the county's population in 1966.
a numeric vector of the county's population in 1967.
a numeric vector of the county's population in 1968.
a numeric vector of the county's population in 1969.
a numeric vector of the county's population in 1970.
a numeric vector of the county's population in 1971.
a numeric vector of the county's population in 1972.
a numeric vector of the county's population in 1973.
a numeric vector of the county's population in 1974.
a numeric vector of the county's population in 1975.
a numeric vector of the county's population in 1976.
a numeric vector of the county's population in 1977.
a numeric vector of the county's population in 1978.
a numeric vector of the county's population in 1979.
a numeric vector of the county's population in 1980.
a numeric vector of the county's population in 1981.
a numeric vector of the county's population in 1982.
a numeric vector of the county's population in 1983.
a numeric vector of the county's population in 1984.
a numeric vector of the county's population in 1985.
a numeric vector of the county's population in 1986.
a numeric vector of the county's population in 1987.
a numeric vector of the county's population in 1988.
a numeric vector of the county's population in 1989.
a numeric vector of the county's population in 1990.
a numeric vector of the county's population in 1991.
a numeric vector of the county's population in 1992.
a numeric vector of the county's population in 1993.
a numeric vector of the county's population in 1994.
a numeric vector of the county's population in 1995.
a numeric vector of the county's population in 1996.
a numeric vector of the county's population in 1997.
a numeric vector of the county's population in 1998.
a numeric vector of the county's population in 1999.
a numeric vector of the county's population in 2000.
a numeric vector of the county's population in 2001.
a numeric vector of the county's population in 2002.
a numeric vector of the county's population in 2003.
a numeric vector of the county's population in 2004.
a numeric vector of the county's population in 2005.
a numeric vector of the county's population in 2006.
a numeric vector of the county's population in 2007.
a numeric vector of the county's population in 2008.
a numeric vector of the county's population in 2009.
a numeric vector of the county's population in 2010.
a numeric vector of the county's population in 2011.
.
This dataset was pulled from the Utah government website in January, 2015.
Counts and rates of age-adjusted (2000 U.S. standard) lung cancer mortality data among white women, aggregated for 1995-9 and 2000-4.
data(wflung00and95)
data(wflung00and95)
A data frame with 51 observations, 1 for each state + DC, on the following 12 variables.
a numeric vector of age-adjusted rates by state during 2000-4 for white females
a numeric vector of the number of white female lung cancer deaths during 2000-4
a numeric vector of the 95% confidence interval lower bound for white female 2000-4 rates
a numeric vector of the 95% confidence interval upper bound for white female 2000-4 rates
a numeric vector of the white female population during 2000
a numeric vector of the standard error of the white female 2000-4 rates
a numeric vector of age-adjusted rates by state during 1995-9 for white females
a numeric vector of the number of white female lung cancer deaths during 1995-9
a numeric vector of the 95% confidence interval lower bound for white female 1995-9 rates
a numeric vector of the 95% confidence interval upper bound for white female 1995-9 rates
a numeric vector of the white female population estimates for 1995
a numeric vector of the standard error of the white female 1995-9 rates
The rates on this file are directly age adjusted to the US 2000 standard population and are expressed as the number of deaths per 100,000 person-years. The row names are the 2 character postal codes for the states. The data represents the rates for two periods of time: 2000 to 2004 and 1995 to 1999. This dataset is used in the micromapSEER examples using the border group of "USStatesDF".
Linda W. Pickle and Jim Pearson of StatNet Consulting, LLC, Gaithersburg, MD
Surveillance Research Program, National Cancer Institute SEER*Stat software (https://www.seer.cancer.gov/seerstat), November 2007 data submission, released April 2008. Data originally provided to NCI by the National Center for Health Statistics.
Counts and age-adjusted rate of white female lung cancer for the total U.S. for the aggregated periods 1995-9 and 2000-4.
data(wflung00and95US)
data(wflung00and95US)
A data frame with 1 observation on the following 13 variables.
a numeric vector, state rate for 2000-2004
a numeric vector, state number of cases for 2000-2004
a numeric vector, lower end point for 95% confidence interval
a numeric vector, upper end point for 95% confidence interval
a numeric vector, state population fro 2000-2004
a numeric vector, state standard error
a numeric vector, state rate for 1995-1999
a numeric vector, state number of cases for 1995-1999
a numeric vector, lower end point for 95% confidence interval
a numeric vector, upper end point for 95% confidence interval
a numeric vector, state population for 2000-2004
a numeric vector, state standard error
See documentation for wflung00and95 for more details. The row name is the associated state abbreviation - 2 characters. This dataset is used in the micromapSEER examples using a border group of "USStatesDF"..
Linda W. Pickle and Jim Pearson of StatNet Consulting, LLC, Gaithersburg, MD
Surveillance Research Program, National Cancer Institute SEER*Stat software (https://www.seer.cancer.gov/seerstat), November 2007 data submission, released April 2008. Data originally provided to NCI by the National Center for Health Statistics.
none
Counts and rates of lung cancer mortality data among white women, aggregated for 2000-4 by county.
data(wflung00cnty)
data(wflung00cnty)
A data frame with 2577 observations on the following 6 variables.
a numeric vector of 5 digit fips codes identifying the state and the county
a numeric vector of age-adjusted rates by county during 2000-4 for white females
a numeric vector of the number of white female lung cancer deaths during 2000-4 by county
a numeric vector of the white female population in the county during 2000
a numberic vector of the 2 digit state fips code
a character vector of the 2 character state postal code
The rates on this file are directly age adjusted to the US 2000 standard population and are expressed as the number of deaths per 100,000 person-years. Counties with from 1 to 9 deaths are suppressed (deleted from the file). This dataset is used by the micromapSEER examples using the border group of "USStatesDF".
Linda W. Pickle and Jim Pearson of StatNet Consulting, LLC, Gaithersburg, MD
Surveillance Research Program, National Cancer Institute SEER*Stat software (http://www.seer.cancer.gov/seerstat), November 2007 data submission, released April 2008. Data originally provided to NCI by the National Center for Health Statistics.
FIP 6-4 Codes
Counts and rates of lung cancer mortality data among white men by state, aggregated for 1950-1969 and 1970-1994
data(wmlung5070)
data(wmlung5070)
A data frame with 51 observations, 1 for each state + DC, on the following 5 variables.
a numeric vector, state age-adjusted rates during 1950-69
a numeric vector, the number of lung cancer deaths during 1950-69
a numeric vector, state age-adjusted rates during 1970-94
a numeric vector, the number of lung cancer deaths during 1970-94
a numeric vector of the percent change in rate from 1950-69 to 1970-94
The rates on this file are directly age adjusted to the US 1970 standard population and are expressed as the number of deaths per 100,000 person-years. The row names are the 2 character postal codes for the states. Note that the data currently available on the NCI web site are from a later data submission and so may differ slightly (in first decimal place) from the rates provided here due to corrections to the dataset after its first publication. The name of each row is the state abbreviation - 2 characters. This dataset is used by the micromapSEER examples using the border group of "USStatesDF".
Linda W. Pickle and Jim Pearson of StatNet Consulting, LLC, Gaithersburg, MD
Surveillance Research Program, National Cancer Institute SEER*Stat software (https://www.seer.cancer.gov/seerstat), November 2007 data submission, released April 2008. Data originally provided to NCI by the National Center for Health Statistics.
Devesa SS, Grauman DJ, Blot WJ, Pennello GA, Hoover RN, Fraumeni, JF Jr. Atlas of cancer mortality in the United States: 1950-94, NIH Publication 99-4564, Bethesda, MD: National Cancer Institute
Count and age-adjusted rate of lung cancer mortality among white men for the total U.S., aggregated for 1950-69 and 1970-94.
data(wmlung5070US)
data(wmlung5070US)
A data frame with 1 observations on the following 5 variables.
a numeric vector, US age adjusted mortality rates for 1950-1969
a numeric vector, US number of cases from 1950-1969
a numeric vector, US age adjusted mortality rates for 1970-1994
a numeric vector, US number of cases from 1970-1994
a numeric vector, change from 1950-1969 to 1970-1994 US rates.
see wmlung5070 for further details. The row name is always US indicating US rates. This dataset is used by the micromapSEER examples using the border group of "USStatesDF".
Linda W. Pickle and Jim Pearson of StatNet Consulting, LLC, Gaithersburg, MD
None