Title: | Estimating Regional Trends of a Prevalence from a DHS and Similar Surveys |
---|---|
Description: | Spatial estimation of a prevalence surface or a relative risks surface, using data from a Demographic and Health Survey (DHS) or an analog survey, see Larmarange et al. (2011) <doi:10.4000/cybergeo.24606>. |
Authors: | Joseph Larmarange [aut, cre] |
Maintainer: | Joseph Larmarange <[email protected]> |
License: | CeCILL |
Version: | 5.0.0.9000 |
Built: | 2024-12-03 05:44:12 UTC |
Source: | https://github.com/larmarange/prevr |
prevR allows spatial estimation of a prevalence surface or a relative risks surface, using data from a Demographic and Health Survey (DHS) or an analog survey.
Package: | prevR |
Type: | Package |
Licence: | CeCILL-C - https://cecill.info/licences/Licence_CeCILL-C_V1-en.html |
Website: | https://larmarange.github.io/prevR/ |
This package performs a methodological approach for spatial estimation of regional trends of a prevalence using data from surveys using a stratified two-stage sample design (as Demographic and Health Surveys). In these kind of surveys, positive and control cases are spatially positioned at the centre of their corresponding surveyed cluster.
This package provides functions to estimate a prevalence surface using a kernel estimator with adaptative bandwidths of equal number of persons surveyed (a variant of the nearest neighbor technique) or with fixed bandwidths. The prevalence surface could also be calculated using a spatial interpolation (kriging or inverse distance weighting) after a moving average smoothing based on circles of equal number of observed persons or circles of equal radius.
With the kernel estimator approach, it's also possible to estimate a surface of relative risks.
For a quick demo, enter quick.prevR(fdhs)
.
For a full demo, enter demo(prevR)
.
The content of prevR can be broken up as follows:
Datasets
fdhs is a fictive dataset used for testing the package.
TMWorldBorders provides national borders of every countries in the World
and could be used to define the limits of the studied area.
Creating objects
prevR functions takes as input objects of class prevR.import.dhs()
allows to import easily, through a step by step procedure,
data from a DHS (Demographic and Health Surveys) downloaded from
http://www.measuredhs.com.as.prevR()
is a generic function to create an object of class
prevR.create.boundary()
could be used to select borders of a country and
transfer them to as.prevR()
in order to define the studied area.
Data visualization
Methods show()
, print()
and summary()
display a summary of a object of class
prevR.
The method plot()
could be used on a object of class
prevR for visualizing the studied area, spatial position
of clusters, number of observations or number of positive cases by cluster.
Data manipulation
The method changeproj()
changes the projection
of the spatial coordinates.
The method as.data.frame()
converts an object of
class prevR into a data frame.
The method export()
export data and/or the studied
area in a text file, a dbf file or a shapefile.
Data analysisrings()
calculates rings of equal number of
observations and/or equal radius.kde()
calculates a prevalence surface or a relative
risks surface using gaussian kernel density estimators (kde)
with adaptative bandwidths.krige()
executes a spatial interpolation using an
ordinary kriging.idw()
executes a spatial interpolation using an inverse
distance weighting (idw) technique.
prevR has been developed with funding from the French National Agency for Research on AIDS and Viral Hepatitis (ANRS - http://www.anrs.fr) and the French Research Institute for Sustainable Development (IRD - https://www.ird.fr), and technical support from LYSIS ([email protected]).
To cite prevR:
Larmarange Joseph, Vallo Roselyne, Yaro Seydou, Msellati Philippe and
Meda Nicolas (2011) "Methods for mapping regional trends of HIV prevalence
from Demographic and Health Surveys (DHS)",
Cybergeo: European Journal of Geography, no 558,
https://journals.openedition.org/cybergeo/24606,
DOI: 10.4000/cybergeo.24606.
Joseph Larmarange [email protected]
IRD - CEPED (UMR 196 Université Paris Descartes Ined IRD)
Larmarange Joseph and Bendaud Victoria (2014) "HIV estimates at second subnational level from national population-based survey", AIDS, n° 28, p. S469-S476, DOI: 10.1097/QAD.0000000000000480
Larmarange Joseph, Vallo Roselyne, Yaro Seydou, Msellati Philippe and Meda Nicolas (2011) "Methods for mapping regional trends of HIV prevalence from Demographic and Health Surveys (DHS)", Cybergeo: European Journal of Geography, n° 558, https://journals.openedition.org/cybergeo/24606, DOI: 10.4000/cybergeo.24606
Larmarange Joseph (2007) Prévalences du VIH en Afrique : validité d'une mesure, PhD thesis in demography, directed by Benoît Ferry, université Paris Descartes, https://theses.hal.science/tel-00320283.
Larmarange Joseph, Vallo Roselyne, Yaro Seydou, Msellati Philippe Meda Nicolas and Ferry Benoît (2006), "Cartographier les données des enquêtes démographiques et de santé à partir des coordonnées des zones d'enquête", Chaire Quételet, 29 novembre au 1er décembre 2006, Université Catholique de Louvain, Louvain-la-Neuve, Belgique.
## Not run: par(ask = TRUE) # Creating an object of class prevR col <- c( id = "cluster", x = "x", y = "y", n = "n", pos = "pos", c.type = "residence", wn = "weighted.n", wpos = "weighted.pos" ) dhs <- as.prevR(fdhs.clusters, col, fdhs.boundary) str(dhs) print(dhs) plot(dhs, main = "Clusters position") plot(dhs, type = "c.type", main = "Clusters by residence") plot(dhs, type = "count", main = "Observations by cluster") plot(dhs, type = "flower", main = "Positive cases by cluster") # Changing coordinates projection plot(dhs, axes = TRUE) dhs <- changeproj( dhs, "+proj=utm +zone=30 +ellps=WGS84 +datum=WGS84 +units=m +no_defs" ) print(dhs) plot(dhs, axes = TRUE) # Calculating rings of equal number of observations for different values of N dhs <- rings(dhs, N = c(100, 200, 300, 400, 500)) print(dhs) summary(dhs) # Prevalence surface for N=300 prev.N300 <- kde(dhs, N = 300, nb.cells = 200) plot( prev.N300["k.wprev.N300.RInf"], pal = prevR.colors.red, lty = 0, main = "Regional trends of prevalence (N=300)" ) # Smoothing ring radii surface (spatial interpolation by kriging) radius.N300 <- krige("r.radius", dhs, N = 300, nb.cells = 200) plot( radius.N300, pal = prevR.colors.blue, lty = 0, main = "Radius of circle (N=300)" ) par(ask = FALSE) ## End(Not run)
## Not run: par(ask = TRUE) # Creating an object of class prevR col <- c( id = "cluster", x = "x", y = "y", n = "n", pos = "pos", c.type = "residence", wn = "weighted.n", wpos = "weighted.pos" ) dhs <- as.prevR(fdhs.clusters, col, fdhs.boundary) str(dhs) print(dhs) plot(dhs, main = "Clusters position") plot(dhs, type = "c.type", main = "Clusters by residence") plot(dhs, type = "count", main = "Observations by cluster") plot(dhs, type = "flower", main = "Positive cases by cluster") # Changing coordinates projection plot(dhs, axes = TRUE) dhs <- changeproj( dhs, "+proj=utm +zone=30 +ellps=WGS84 +datum=WGS84 +units=m +no_defs" ) print(dhs) plot(dhs, axes = TRUE) # Calculating rings of equal number of observations for different values of N dhs <- rings(dhs, N = c(100, 200, 300, 400, 500)) print(dhs) summary(dhs) # Prevalence surface for N=300 prev.N300 <- kde(dhs, N = 300, nb.cells = 200) plot( prev.N300["k.wprev.N300.RInf"], pal = prevR.colors.red, lty = 0, main = "Regional trends of prevalence (N=300)" ) # Smoothing ring radii surface (spatial interpolation by kriging) radius.N300 <- krige("r.radius", dhs, N = 300, nb.cells = 200) plot( radius.N300, pal = prevR.colors.blue, lty = 0, main = "Radius of circle (N=300)" ) par(ask = FALSE) ## End(Not run)
This function merges the slots clusters
et rings
of
a object of class prevR
.
## S3 method for class 'prevR' as.data.frame(x, ..., N = NULL, R = NULL, clusters.only = FALSE)
## S3 method for class 'prevR' as.data.frame(x, ..., N = NULL, R = NULL, clusters.only = FALSE)
x |
object of class |
... |
not used, for compatibility with the generic method
|
N |
integer or list of integers setting elements of
|
R |
integer or list of integers setting elements of
|
clusters.only |
return only the slot |
If clusters.only = TRUE
, the function will return only the
slot clusters
of x
.
Otherwise, slots clusters
and rings
of x
will be
merged in a unique data frame. The columns of rings
will be renamed
adding a suffix like .N300.RInf.
N
and R
define the elements of rings
to extract.
If not specified (NULL
), all the elements of rings
will
be included.
str(fdhs) str(as.data.frame(fdhs)) ## Not run: r.fdhs <- rings(fdhs, N = c(100, 200, 300)) str(r.fdhs) str(as.data.frame(r.fdhs, clusters.only = TRUE)) str(as.data.frame(r.fdhs)) str(as.data.frame(r.fdhs, N = 300)) ## End(Not run)
str(fdhs) str(as.data.frame(fdhs)) ## Not run: r.fdhs <- rings(fdhs, N = c(100, 200, 300)) str(r.fdhs) str(as.data.frame(r.fdhs, clusters.only = TRUE)) str(as.data.frame(r.fdhs)) str(as.data.frame(r.fdhs, N = 300)) ## End(Not run)
This function creates an object of class prevR
from a data frame.
as.prevR(data, col, boundary = NULL, proj = "+proj=longlat +datum=WGS84")
as.prevR(data, col, boundary = NULL, proj = "+proj=longlat +datum=WGS84")
data |
data frame, each line corresponding to an observed cluster. |
col |
vector identifying the columns of
See examples. |
boundary |
object of class sf::sf defining the studied area. |
proj |
projection of clusters coordinates used in |
Only "x", "y" "n" and "pos" are required in col
.
If "id" is not specified, a numerical identifier will be automatically
created.
If boundary
is not defined (NULL
), a rectangle corresponding to minimal
and maximal coordinates of data
will be used.
boundary
could be the result of the function create.boundary()
.
It's not possible to change projection of data
with as.prevR()
.
Use changeproj()
instead.
Object of class prevR
prevR
class, create.boundary()
,
changeproj()
, import.dhs()
.
col <- c( id = "cluster", x = "x", y = "y", n = "n", pos = "pos", c.type = "residence", wn = "weighted.n", wpos = "weighted.pos" ) dhs <- as.prevR(fdhs.clusters, col, fdhs.boundary) str(dhs) print(dhs)
col <- c( id = "cluster", x = "x", y = "y", n = "n", pos = "pos", c.type = "residence", wn = "weighted.n", wpos = "weighted.pos" ) dhs <- as.prevR(fdhs.clusters, col, fdhs.boundary) str(dhs) print(dhs)
This function converts map projection (and/or datum) used by an object of class prevR into another one.
## S4 method for signature 'prevR' changeproj(object, proj)
## S4 method for signature 'prevR' changeproj(object, proj)
object |
object of class prevR. |
proj |
new map projection. One of
(i) character: a string accepted by GDAL, (ii) integer, a valid EPSG value
(numeric), or (iii) an object of class |
changeproj()
transform the columns "x" and "y" of the slot
clusters
of object
and convert boundary
using the new
map projection defined by proj
.
If applicable, the slot rings
will be recalculated.
Return object
expressed in the projection proj
.
print(fdhs) plot(fdhs, axes = TRUE, main = "Projection: longitude/latitude") fdhs2 <- changeproj( fdhs, "+proj=utm +zone=30 +ellps=WGS84 +datum=WGS84 +units=m +no_defs" ) print(fdhs2) plot(fdhs2, axes = TRUE, main = "Projection: UTM Zone 30")
print(fdhs) plot(fdhs, axes = TRUE, main = "Projection: longitude/latitude") fdhs2 <- changeproj( fdhs, "+proj=utm +zone=30 +ellps=WGS84 +datum=WGS84 +units=m +no_defs" ) print(fdhs2) plot(fdhs2, axes = TRUE, main = "Projection: UTM Zone 30")
This function uses the data set TMWorldBorders. One or several countries can be selected and will be returned as an object of class sp::SpatialPolygons.
create.boundary( countries = NULL, multiple = FALSE, proj = "+proj=longlat +datum=WGS84" )
create.boundary( countries = NULL, multiple = FALSE, proj = "+proj=longlat +datum=WGS84" )
countries |
a vector of character string corresponding to the name of
the countries you want to extract from the dataset. If |
multiple |
should the dialog box allow multiple selection
(unused if |
proj |
projection of clusters coordinates used in |
Object of class sp::SpatialPolygons.
The result will be automatically plotted.
## Not run: boundary <- create.boundary() ## End(Not run) boundary <- create.boundary("Burkina Faso") boundary <- create.boundary("Burkina Faso", proj = "+proj=utm +zone=30 +ellps=WGS84 +datum=WGS84 +units=m +no_defs" ) boundary <- create.boundary(countries = c("Burkina Faso", "Ghana", "Benin"))
## Not run: boundary <- create.boundary() ## End(Not run) boundary <- create.boundary("Burkina Faso") boundary <- create.boundary("Burkina Faso", proj = "+proj=utm +zone=30 +ellps=WGS84 +datum=WGS84 +units=m +no_defs" ) boundary <- create.boundary(countries = c("Burkina Faso", "Ghana", "Benin"))
Direct label a ggplot2 grouped plot
direct.label_prevR(p, method = NULL, debug = FALSE)
direct.label_prevR(p, method = NULL, debug = FALSE)
p |
The ggplot object. |
method |
Method for direct labeling
(see |
debug |
Show debug output? |
The ggplot object with direct labels added.
This function is based on and similar to
directlabels::direct.label()
except that legend is not hidden.
This method could be used to export an object of class prevR in different formats (text, shapefile, dbase...)
## S4 method for signature 'prevR' export( object, element, format, file, N = NULL, R = NULL, clusters.only = FALSE, ext = NULL, sep = NULL, dec = NULL, ... )
## S4 method for signature 'prevR' export( object, element, format, file, N = NULL, R = NULL, clusters.only = FALSE, ext = NULL, sep = NULL, dec = NULL, ... )
object |
object of class prevR. |
element |
element to export: "clusters" or "boundary". |
format |
format: "dbf", "txt", csv", "csv2" or "shp"
(unused if |
file |
file name without extension. |
N |
integer or list of integers setting elements of |
R |
integer or list of integers setting elements of |
clusters.only |
export only the slot |
ext |
coerce the extension of the export file
(unused if |
sep |
coerce the field separator string
(unused if |
dec |
coerce the string to use for decimal point
(unused if |
... |
additional arguments transmitted to
sf::st_write, |
If element="boundary"
, the slot boundary
of
object
will be exported as a shapefile.
Otherwise, the slot clusters
, merged with the slot rings
,
will be exported.
See as.data.frame()
for details on the use of the parameters of N
,
R
et clusters.only
.
format
specifies the export format of the data frame returned by
as.data.frame()
:
"shp" | Shape File |
"dbf" | DBASE format |
"txt" | tabulated text |
"csv" | 'comma separated values' |
"csv2" | CSV variant using a semicolon as field separator |
ext
could be used to coerce the extension of the output file,
except for shapefile export, which will write four different files
(.shp, .shx, .dbf and .prj).
The "txt" format uses by default a tabulation as field separator and a point "." for decimal point.
The "csv" format uses a comma "," as field separator and a point "." as decimal point.
The "csv2" format is a variant using a semicolon ";" as field separator and a colon "," for decimal point, the Excel convention for CSV files in some Western European locales.
sep
and dec
could be used to coerce the field separator and
the decimal point (together with the "txt" format).
sf::st_write()
,
foreign::write.dbf()
, utils::write.table()
.
## Not run: export(fdhs, element = "boundary", file = "area") export(fdhs, element = "clusters", format = "shp", file = "points") dhs <- rings(fdhs, N = c(100, 300, 500)) export(dhs, element = "clusters", format = "csv", N = 300, file = "points") ## End(Not run)
## Not run: export(fdhs, element = "boundary", file = "area") export(fdhs, element = "clusters", format = "shp", file = "points") dhs <- rings(fdhs, N = c(100, 300, 500)) export(dhs, element = "clusters", format = "csv", N = 300, file = "points") ## End(Not run)
Data set generated by a Demographic and Health Survey (DHS) simulation on a fictitious country with a national prevalence of 10\ surveyed, distributed in 401 clusters. This dataset is composed of 3 objects:
fdhs.clusters
: data frame (one line per cluster).
fdhs.boundary
: object of class
sp::SpatialPolygons corresponding to the borders of the
fictitious country.
fdhs
: object of class prevR
returned by as.prevR()
using the two previous objects.
## Not run: str(fdhs) str(fdhs.clusters) str(fdhs.boundary) demo(prevR) ## End(Not run)
## Not run: str(fdhs) str(fdhs.clusters) str(fdhs.boundary) demo(prevR) ## End(Not run)
This step by step function guides users to import data from a Demographic and Health Survey (DHS) and create an object of class prevR.
import.dhs(file.sav, file.dbf)
import.dhs(file.sav, file.dbf)
file.sav |
DHS data (one individual per line) in SPSS format (.sav), downloaded from https://www.dhsprogram.com/. Could also be directly a data.frame. |
file.dbf |
GPS position of clusters in DATABASE format (.dbf), downloaded from https://www.dhsprogram.com/. Could also be directly a data.frame. |
If you don't provide the precise path of files, R will check the
working directory (see base::setwd()
).
To specify the file path, see base::file.path()
.
This function was developed specifically for importing DHS.
For a generic function for creating an object of class prevR,
see as.prevR()
.
as.prevR()
, prevR class.
## Not run: imported_data <- import.dhs("data.sav", "gps.dbf") ## End(Not run)
## Not run: imported_data <- import.dhs("data.sav", "gps.dbf") ## End(Not run)
rings
or the slot boundary
.Test if an object is of class prevR.
This function test if the class of an object is prevR.
It could be used to test the slot rings
or the slot boundary
.
is.prevR(object, slot = NULL)
is.prevR(object, slot = NULL)
object |
object to test. |
slot |
"clusters", "rings","boundary" or "proj". |
Slots rings
and boundary
are always present in an object of
class prevR, but rings
could be NULL
and
boundary
a sf::sf object with an attribute named valid
with the value FALSE
(when boundaries of the studied
area have not been specified explicitly).
If rings
is NULL
, is.prevR(object,"rings")
will return FALSE
.
If boundary
has an attribute valid
equal to FALSE
,
is.prevR(object,"boundary")
will return FALSE
.
TRUE
or FALSE
col <- c( id = "cluster", x = "x", y = "y", n = "n", pos = "pos", c.type = "residence", wn = "weighted.n", wpos = "weighted.pos" ) dhs <- as.prevR(fdhs.clusters, col, fdhs.boundary) is.prevR(dhs) is.prevR(dhs, "rings") is.prevR(dhs, "boundary") dhs <- rings(dhs, N = 300) is.prevR(dhs, "rings")
col <- c( id = "cluster", x = "x", y = "y", n = "n", pos = "pos", c.type = "residence", wn = "weighted.n", wpos = "weighted.pos" ) dhs <- as.prevR(fdhs.clusters, col, fdhs.boundary) is.prevR(dhs) is.prevR(dhs, "rings") is.prevR(dhs, "boundary") dhs <- rings(dhs, N = 300) is.prevR(dhs, "rings")
This function allows to calculate a prevalence surface (ratio of two intensity surfaces) and/or a relative risks surface (ratio of two density surfaces) using gaussian kernel estimators with adaptative bandwidths of equal number of observations or equal radius.
## S4 method for signature 'prevR' kde( object, N = NULL, R = NULL, weighted = TRUE, risk.ratio = FALSE, keep.details = FALSE, nb.cells = 100, cell.size = NULL, progression = TRUE, short.names = FALSE )
## S4 method for signature 'prevR' kde( object, N = NULL, R = NULL, weighted = TRUE, risk.ratio = FALSE, keep.details = FALSE, nb.cells = 100, cell.size = NULL, progression = TRUE, short.names = FALSE )
object |
object of class prevR. |
N |
integer or list of integers corresponding to the rings to use. |
R |
integer or list of integers corresponding to the rings to use. |
weighted |
use weighted data ( |
risk.ratio |
calculate a relative risks surface instead of a
prevalence surface ( |
keep.details |
return surface of positive cases and surface of observed cases? |
nb.cells |
number of cells on the longest side of the studied area
(unused if |
cell.size |
size of each cell (in the unit of the projection). |
progression |
show a progress bar? |
short.names |
should names of the output be short? |
This function calculates a prevalence surface as the ratio of the intensity surface (expressed in cases per surface unit) of positive cases on the intensity surface of observed cases and could also calculate a relative risks surface corresponding to the ratio of the density surface (whose integral has been normalized to one) of positive cases on density surface of observed cases.
This method is a variant of the nearest neighbor technique. Surfaces are
estimated using gaussian kernel estimators with adaptative bandwidths,
bandwidth size being determined by a minimum number of
observations in the neighborhood (see rings()
for more details).
Fixed bandwidths could also be used. More precisely, the bandwidth used is
half the radius of rings of equal number of observations or equal radius
(parameters N
and R
) calculated by the' function rings()
.
See references for a detailed explanation of the implemented methodology.
N
and R
determine the rings to use for the estimation. If they are not
defined, surfaces will be estimated for each available couples (N,R)
available in object
. Several estimations could be
simultaneously calculated if several values of N and R are defined.
A suggested value of N could be computed with Noptim()
.
Object of class sf::sf.
Surfaces are named according to the name of the corresponding N and R
(for example: k.prev.N300.RInf). If short.names
is TRUE
and
if there is only one combination of couples (N, R), variable names will not
be suffixed by the value of N and R.
Estimated variables are (depending on the function parameters) :
"k.pos" unweighted intensity surface of positive cases.
"k.obs" unweighted intensity surface of observed cases.
"k.prev" unweighted surface of prevalence (k.pos/k.obs).
"k.case" unweighted density surface of positive cases.
"k.control" unweighted density surface of observed cases.
"k.rr" unweighted surface of relative risks (k.case/k.control).
"k.wpos" weighted intensity surface of positive cases.
"k.wobs" weighted intensity surface of observed cases.
"k.wprev" weighted surface of prevalence (k.wpos/k.wobs).
"k.wcase" weighted density surface of positive cases.
"k.wcontrol" weighted density surface of observed cases.
"k.wrr" weighted surface of relative risks (k.wcase/k.wcontrol).
Results could be plotted with sf::plot()
or with ggplot2
using ggplot2::geom_sf()
. See examples.
prevR provides several continuous color palettes (see prevR.colors).
Results could be turned into a stars raster using
stars::st_rasterize()
.
To export to ASCII grid, rasterize the results with stars::st_rasterize()
,
convert to SpatRast
with terra::rast()
, extract the desired layer with
[[]]
and then use terra::writeRaster()
. See examples.
See the package sparr for another methodology to estimate relative risks surfaces, adapted for other kind of data than Demographic and Health Surveys (DHS).
Larmarange Joseph, Vallo Roselyne, Yaro Seydou, Msellati Philippe and Meda Nicolas (2011) "Methods for mapping regional trends of HIV prevalence from Demographic and Health Surveys (DHS)", Cybergeo: European Journal of Geography, no 558, https://journals.openedition.org/cybergeo/24606, DOI: 10.4000/cybergeo.24606.
KernSmooth::bkde2D()
, rings()
, Noptim()
.
## Not run: dhs <- rings(fdhs, N = c(100, 200, 300, 400, 500)) prev.N300 <- kde(dhs, N = 300, nb.cells = 200) plot(prev.N300, lty = 0) library(ggplot2) ggplot(prev.N300) + aes(fill = k.wprev.N300.RInf) + geom_sf(colour = "transparent") + scale_fill_gradientn(colors = prevR.colors.red()) + theme_prevR_light() # Export k.wprev.N300.RInf surface in ASCII Grid r <- terra::rast(stars::st_rasterize(prev.N300)) # writeRaster(r[[2]], "kprev.N300.asc") ## End(Not run)
## Not run: dhs <- rings(fdhs, N = c(100, 200, 300, 400, 500)) prev.N300 <- kde(dhs, N = 300, nb.cells = 200) plot(prev.N300, lty = 0) library(ggplot2) ggplot(prev.N300) + aes(fill = k.wprev.N300.RInf) + geom_sf(colour = "transparent") + scale_fill_gradientn(colors = prevR.colors.red()) + theme_prevR_light() # Export k.wprev.N300.RInf surface in ASCII Grid r <- terra::rast(stars::st_rasterize(prev.N300)) # writeRaster(r[[2]], "kprev.N300.asc") ## End(Not run)
These functions execute a spatial interpolation of a variable of the slot
rings
of an object of class prevR. The method krige()
implements the ordinary kriging technique. The method idw()
executes
an inverse distance weighting interpolation.
## S4 method for signature 'ANY,prevR' krige( formula, locations, N = NULL, R = Inf, model = NULL, nb.cells = 100, cell.size = NULL, fit = "auto", keep.variance = FALSE, show.variogram = FALSE, ... ) ## S4 method for signature 'ANY,prevR' idw( formula, locations, N = NULL, R = Inf, nb.cells = 100, cell.size = NULL, idp = 2, ... )
## S4 method for signature 'ANY,prevR' krige( formula, locations, N = NULL, R = Inf, model = NULL, nb.cells = 100, cell.size = NULL, fit = "auto", keep.variance = FALSE, show.variogram = FALSE, ... ) ## S4 method for signature 'ANY,prevR' idw( formula, locations, N = NULL, R = Inf, nb.cells = 100, cell.size = NULL, idp = 2, ... )
formula |
variable(s) to interpolate (see details). |
locations |
object of class prevR. |
N |
integer or list of integers corresponding to the rings to use. |
R |
integer or list of integers corresponding to the rings to use. |
model |
a variogram model returned by the function |
nb.cells |
number of cells on the longest side of the studied area
(unused if |
cell.size |
size of each cell (in the unit of the projection). |
fit |
|
keep.variance |
return variance of estimates? |
show.variogram |
plot the variogram? |
... |
additional arguments transmitted to |
idp |
inverse distance weighting power (see |
formula
specifies the variable(s) to interpolate. Only variables
available in the slot rings
of locations
could be used. Possible values
are "r.pos"
, "r.n"
, "r.prev"
, "r.radius"
, "r.clusters"
, "r.wpos"
,
"r.wn"
or "r.wprev"
. Variables could be specified with a character
string or a formula (example: list(r.pos ~ 1, r.prev ~ 1}
. Only formula
like variable.name ~ 1
are accepted. For more complex interpolations,
use directly functions gstat::krige()
and gstat::idw()
from gstat.
N
and R
determine the rings to use for the interpolation. If they are
not defined, surfaces will be estimated for each available couples (N,R).
Several interpolations could be simultaneously calculated if several
variables and/or several values of N and R are defined.
A suggested value of N could be computed with Noptim()
.
In the case of an ordinary kriging, the method krige()
from prevR
will try to fit automatically a exponential variogram to the sample variogram
(fit = "auto"
). You can also specify directly the variogram to use with
the parameter model
.
Interpolations are calculated on a spatial grid obtained with
make.grid.prevR()
.
Object of class sf::sf.
The name of estimated surfaces depends on the name of the interpolated
variable, N and R (for example: r.radius.N300.RInf).
If you ask the function to return variance (keep.variance=TRUE
),
corresponding surfaces names will have the suffix .var.
Results could be plotted with sf::plot()
or with ggplot2
using ggplot2::geom_sf()
. See examples.
prevR provides several continuous color palettes (see prevR.colors).
Results could be turned into a stars raster using
stars::st_rasterize()
.
To export to ASCII grid, rasterize the results with stars::st_rasterize()
,
convert to SpatRast
with terra::rast()
, extract the desired layer with
[[]]
and then use terra::writeRaster()
. See examples.
Larmarange Joseph, Vallo Roselyne, Yaro Seydou, Msellati Philippe and Meda Nicolas (2011) "Methods for mapping regional trends of HIV prevalence from Demographic and Health Surveys (DHS)", Cybergeo: European Journal of Geography, no 558, https://journals.openedition.org/cybergeo/24606, DOI: 10.4000/cybergeo.24606.
gstat::krige()
, gstat::idw()
, rings()
, Noptim()
.
## Not run: dhs <- rings(fdhs, N = c(100,200,300,400,500)) radius.N300 <- krige('r.radius', dhs, N = 300, nb.cells = 50) prev.krige <- krige(r.wprev ~ 1, dhs, N = c(100, 300, 500)) plot(prev.krige, lty = 0) library(ggplot2) ggplot(prev.krige) + aes(fill = r.wprev.N300.RInf) + geom_sf(colour = "transparent") + scale_fill_gradientn(colors = prevR.colors.red()) + theme_prevR_light() # Export r.wprev.N300.RInf surface in ASCII Grid r <- terra::rast(stars::st_rasterize(prev.krige)) # writeRaster(r[[2]], "wprev.N300.asc") ## End(Not run)
## Not run: dhs <- rings(fdhs, N = c(100,200,300,400,500)) radius.N300 <- krige('r.radius', dhs, N = 300, nb.cells = 50) prev.krige <- krige(r.wprev ~ 1, dhs, N = c(100, 300, 500)) plot(prev.krige, lty = 0) library(ggplot2) ggplot(prev.krige) + aes(fill = r.wprev.N300.RInf) + geom_sf(colour = "transparent") + scale_fill_gradientn(colors = prevR.colors.red()) + theme_prevR_light() # Export r.wprev.N300.RInf surface in ASCII Grid r <- terra::rast(stars::st_rasterize(prev.krige)) # writeRaster(r[[2]], "wprev.N300.asc") ## End(Not run)
This function generates a spatial rectangular grid from the slot
boundary
of an object of class prevR
; function used in
particular by the methods kde()
, krige()
and idw()
.
make.grid.prevR(object, nb.cells = 100, cell.size = NULL)
make.grid.prevR(object, nb.cells = 100, cell.size = NULL)
object |
object of class prevR. |
nb.cells |
number of cells on the longest side of the studied area
(unused if |
cell.size |
size of each cell (in the unit of the projection). |
This function generates a spatial rectangular grid, each cell
corresponding to the center of a square of side cell.size
.
If cell.size
is not defined, side of cells will be
calculated as the longest side of the slot boundary
of object
divided by nb.cells
.
Object of class sf::sfc (simple feature geometry list column).
make.grid.prevR(fdhs) make.grid.prevR(fdhs, nb.cells = 200)
make.grid.prevR(fdhs) make.grid.prevR(fdhs, nb.cells = 200)
Based on previous simulation work, the function suggests an optimal value for the N parameter based on national prevalence, the total number of observations and the number of clusters. See Larmarange et al. 2011 for more details.
Noptim(object)
Noptim(object)
object |
object of class prevR. |
an integer.
Larmarange Joseph, Vallo Roselyne, Yaro Seydou, Msellati Philippe and Meda Nicolas (2011) "Methods for mapping regional trends of HIV prevalence from Demographic and Health Surveys (DHS)", Cybergeo: European Journal of Geography, no 558, https://journals.openedition.org/cybergeo/24606, DOI: 10.4000/cybergeo.24606.
Noptim(fdhs)
Noptim(fdhs)
Method plot
for object of class prevR.
Plot clusters, number of observations per cluster or number of positive cases
per cluster.
## S4 method for signature 'prevR,missing' plot( x, type = "position", add.legend = TRUE, legend.location = "bottomright", factor.size = 0.2, new.window = FALSE, axes = FALSE, ... )
## S4 method for signature 'prevR,missing' plot( x, type = "position", add.legend = TRUE, legend.location = "bottomright", factor.size = 0.2, new.window = FALSE, axes = FALSE, ... )
x |
object of class prevR. |
type |
graph to plot:
|
add.legend |
add a legend? |
legend.location |
legend location. |
factor.size |
scale factor of rings (for |
new.window |
plot in a new window? |
axes |
show axes? |
... |
additional arguments transmitted to |
Available values for legend.location
are: "bottomright",
"bottom", "bottomleft", "left", "topleft",
"top", "topright", "right" use "center".
Use main
to define a title and sub
for a subtitle
(see graphics::title()
).
graphics::title()
, graphics::legend()
.
plot(fdhs, type = "position", main = "position", axes = TRUE) plot(fdhs, type = "c.type", main = "c.type") plot(fdhs, type = "count", main = "count", factor.size = 0.1) plot(fdhs, type = "flower", main = "flower")
plot(fdhs, type = "position", main = "position", axes = TRUE) plot(fdhs, type = "c.type", main = "c.type") plot(fdhs, type = "count", main = "count", factor.size = 0.1) plot(fdhs, type = "flower", main = "flower")
Class used by the package prevR
clusters
data.frame
with observed data (one line per cluster).
Columns names are:
"id" cluster ID.
"x" longitude.
"y" latitude.
"n" number of valid observations per cluster.
"pos" number of positive cases per cluster.
"prev" observed prevalence (in %) in the cluster (pos/n).
"wn" (optional) sum of weights of observations per cluster.
"wpos" (optional) sum of weights of positive cases per cluster.
"wprev" (optional) weighted observed prevalence (in %) in the cluster (wpos/wn).
"c.type" (optional) cluster type.
boundary
object of class sf::sf, borders of the studied area.
proj
object of class sf::crs, map projection used.
rings
list of results returned by rings()
.
Each entry is composed of 3 elements: N
, minimum number of
observations per ring; R
, maximum radius of rings and
estimates
, a data frame with the following
variables:
"id" cluster ID.
"r.pos" number of positive cases inside the ring.
"r.n" number of valid observations inside the ring.
"r.prev" observed prevalence (in \
"r.radius" ring radius (in kilometers if coordinates in decimal degrees, in the unit of the projection otherwise).
"r.clusters" number of clusters located inside the ring.
"r.wpos" (optional) sum of weights of positive cases inside the ring.
"r.wn" (optional) sum of weights of valid observations inside the ring.
"r.wprev" (optional) weighted observed prevalence (in %) inside the ring (r.wpos/r.wn).
Note: the list rings
is named, the name of each element is
NN_value.RR_value, for example N300.RInf.
Objects of this class could be created by the function as.prevR()
.
signature(x = "prevR")
converts an object of
class prevR into a data frame.
signature(object = "prevR")
generates a
spatial grid.
signature(object = "prevR")
exports a prevR object as
a shapefile, a dbase file or a text file.
signature(formula = "ANY", locations = "prevR")
calculates a spatial interpolation using an inverse distance weighting.
signature(object = "prevR")
estimates a prevalence
surface using kernel density estimators.
signature(formula = "ANY", locations = "prevR")
calculates a spatial interpolation by kriging.
signature(x = "prevR", y = "ANY")
plots data of a
prevR object.
signature(x = "prevR")
shows a summary of a prevR
object.
signature(object = "prevR")
calculates rings of equal
number of observations and/or equal radius.
signature(object = "prevR")
shows a summary of a prevR
object.
signature(object = "prevR")
shows a summary of the
variables of a prevR object.
signature(object = "prevR")
changes the map
projection used.
as.prevR()
, is.prevR()
, changeproj()
, rings()
, print()
, plot()
,
summary()
, kde()
, krige()
, idw()
, export()
.
showClass("prevR") col <- c( id = "cluster", x = "x", y = "y", n = "n", pos = "pos", c.type = "residence", wn = "weighted.n", wpos = "weighted.pos" ) dhs <- as.prevR(fdhs.clusters, col, fdhs.boundary) str(dhs) print(dhs) ## Not run: dhs <- rings(fdhs, N = c(100, 300, 500)) str(dhs) print(dhs) ## End(Not run)
showClass("prevR") col <- c( id = "cluster", x = "x", y = "y", n = "n", pos = "pos", c.type = "residence", wn = "weighted.n", wpos = "weighted.pos" ) dhs <- as.prevR(fdhs.clusters, col, fdhs.boundary) str(dhs) print(dhs) ## Not run: dhs <- rings(fdhs, N = c(100, 300, 500)) str(dhs) print(dhs) ## End(Not run)
Functions generating color palettes usable with R graphical functions.
These palettes are continuous, contrast being accentuated by darkening
and lightening extreme values. prevR.demo.pal
plot the available
palettes. prevR.colors.qgis.pal
export a palette in a text file
readable by Quantum GIS, an open-source mapping software.
prevR.colors.blue(n = 10) prevR.colors.blue.inverse(n = 10) prevR.colors.gray(n = 10) prevR.colors.gray.inverse(n = 10) prevR.colors.green(n = 10) prevR.colors.green.inverse(n = 10) prevR.colors.red(n = 10) prevR.colors.red.inverse(n = 10) prevR.demo.pal(n, border = if (n < 32) "light gray" else NA, main = NULL) prevR.colors.qgis.pal(file, at, pal = "red", inverse = FALSE)
prevR.colors.blue(n = 10) prevR.colors.blue.inverse(n = 10) prevR.colors.gray(n = 10) prevR.colors.gray.inverse(n = 10) prevR.colors.green(n = 10) prevR.colors.green.inverse(n = 10) prevR.colors.red(n = 10) prevR.colors.red.inverse(n = 10) prevR.demo.pal(n, border = if (n < 32) "light gray" else NA, main = NULL) prevR.colors.qgis.pal(file, at, pal = "red", inverse = FALSE)
n |
number of different colors in the palette. |
border |
border color. |
main |
title. |
file |
file name with extension. |
at |
list of values of the palette. |
pal |
color palette to use ("red", "green", "blue" or "gray"). |
inverse |
use the inverse palette? |
prevR.colors.red()
produces a color gradation from white/yellow
to red/dark red.
prevR.colors.blue()
produces a color gradation from light blue
to dark blue.
prevR.colors.green()
produces a color gradation from light green
to dark green.
prevR.colors.gray()
produces a color gradation from white/light gray to
dark gray/black.
Functions with a suffix .inverse produce the same color gradation, but from dark colors to light ones.
prevR.demo.pal()
plot the color palettes.
prevR.colors.qgis.pal()
export a color palette in a text file readable
by Quantum GIS.
The other functions return a list of colors coded in hexadecimal.
To obtain the list of colors in RGB (Red/Green/Blue), use the function
grDevices::col2rgb()
.
The code of prevR.demo.pal()
was adapted from the function demo.pal
presented in the examples of grDevices::rainbow()
.
Other color palettes are available in R. See for example
grDevices::rainbow()
or the package RColorBrewer.
prevR.demo.pal(25) prevR.colors.red(5) col2rgb(prevR.colors.red(5)) ## Not run: prevR.colors.qgis.pal("palette.txt", seq(0, 25, length.out = 100), "red") ## End(Not run)
prevR.demo.pal(25) prevR.colors.red(5) col2rgb(prevR.colors.red(5)) ## Not run: prevR.colors.qgis.pal("palette.txt", seq(0, 25, length.out = 100), "red") ## End(Not run)
Method print
for objects of class prevR:
shows a summary of the object's characteristics.
## S4 method for signature 'prevR' print(x)
## S4 method for signature 'prevR' print(x)
x |
object of class prevR. |
Exactly the same as show()
.
print(fdhs) ## Not run: dhs <- rings(fdhs, N = c(100, 300, 500)) print(dhs) ## End(Not run)
print(fdhs) ## Not run: dhs <- rings(fdhs, N = c(100, 300, 500)) print(dhs) ## End(Not run)
This function performs several analysis in one go:
(i) apply rings()
;
(ii) compute prevalence surface with kde()
;
(iii) compute the surface of rings radii with krige()
;
(iv) plot prevalence surface using prevR.colors.red()
and add rings radii
as a contour plot.
quick.prevR( object, N = Noptim(object), nb.cells = 100, cell.size = NULL, weighted = NULL, plot.results = TRUE, return.results = FALSE, return.plot = FALSE, legend.title = "%", cex = 0.7, progression = TRUE )
quick.prevR( object, N = Noptim(object), nb.cells = 100, cell.size = NULL, weighted = NULL, plot.results = TRUE, return.results = FALSE, return.plot = FALSE, legend.title = "%", cex = 0.7, progression = TRUE )
object |
object of class prevR. |
N |
integer or list of integers corresponding to the rings to use. |
nb.cells |
number of cells on the longest side of the studied area
(unused if |
cell.size |
size of each cell (in the unit of the projection). |
weighted |
use weighted data (TRUE, FALSE or "2")? |
plot.results |
plot the results? |
return.results |
return the results? |
return.plot |
return the plot within the results? |
legend.title |
title of the legend |
cex |
to control the text size on the graph |
progression |
show a progress bar? |
N
determine the rings to use for the estimation.
By default, a suggested value of N will be computed with Noptim()
.
A list of one or several elements, depending on the arguments:
(i) prev
is a SpatialPixelsDataFrame
containing the prevalence
surface; (ii) radius
a SpatialPixelsDataFrame
containing the
kriged surface of the rings radii; (iii) plot
a ggplot
graph.
Noptim()
, rings()
, kde()
and krige()
.
## Not run: quick.prevR(fdhs) ## End(Not run)
## Not run: quick.prevR(fdhs) ## End(Not run)
For each cluster, this function determines a ring of equal number of observations and/or equal radius and calculates several indicators from observations located inside that ring.
## S4 method for signature 'prevR' rings(object, N = seq(100, 500, 50), R = Inf, progression = TRUE)
## S4 method for signature 'prevR' rings(object, N = seq(100, 500, 50), R = Inf, progression = TRUE)
object |
object of class prevR. |
N |
minimum number of observations. |
R |
maximum rings radius (in kilometers if coordinates in decimal degrees, in the unit of the projection otherwise). |
progression |
show a progress bar? |
For each row of the data frame clusters
of object
, rings()
determines a ring, centered on the cluster. It could be:
rings of equal number of observations if N
is finite and R = Inf
;
rings of equal radius if N = Inf
and R
is finite;
a combination of both (see below) if N
and R
are both finite.
For *rings of equal number of observations, rings()
selects the smallest
ring containing at least N
valid observations.
For rings of equal radius, rings()
selects all clusters located at a
lower distance than R
from the central cluster.
For combination of both, rings()
calculates first the ring with the
minimum number of observations and test if its radius is lower than R
or
not. If so, the ring is kept, otherwise the ring of maximum radius is
calculated.
Different series of rings could be simultaneously calculated by providing
different values for N
and R
. rings()
will calculate rings
corresponding to each couple (N
,R
).
Return object
with the slot rings
completed for each couple
(N,R).
Each entry is composed of 3 elements: N
, minimum number of observations per
ring; R
, maximum radius of rings and estimates
, a data frame with the
following variables:
"id" cluster ID.
"r.pos" number of positive cases inside the ring.
"r.n" number of valid observations inside the ring.
"r.prev" observed prevalence (in %) inside the ring (r.pos/r.n).
"r.radius" ring radius (in kilometers if coordinates in decimal degrees, in the unit of the projection otherwise).
"r.clusters" number of clusters located inside the ring.
"r.wpos" (optional) sum of weights of positive cases inside the ring.
"r.wn" (optional) sum of weights of valid observations inside the ring.
"r.wprev" (optional) weighted observed prevalence (in %) inside the ring (r.wpos/r.wn).
Note: the list rings
is named, the name of each element is
NN_value.RR_value, for example N300.RInf.
Note 2: r.wpos, r.wn and r.wprev are calculated only if
the slot clusters
of object
contains weighted data.
Larmarange Joseph, Vallo Roselyne, Yaro Seydou, Msellati Philippe and Meda Nicolas (2011) "Methods for mapping regional trends of HIV prevalence from Demographic and Health Surveys (DHS)", Cybergeo : European Journal of Geography, no 558, https://journals.openedition.org/cybergeo/24606, DOI: 10.4000/cybergeo.24606.
Larmarange Joseph (2007) Prévalences du VIH en Afrique : validité d'une mesure, PhD thesis in demography, directed by Benoît Ferry, université Paris Descartes, https://theses.hal.science/tel-00320283.
## Not run: print(fdhs) dhs <- rings(fdhs, N = c(100, 200, 300, 400, 500)) print(dhs) ## End(Not run)
## Not run: print(fdhs) dhs <- rings(fdhs, N = c(100, 200, 300, 400, 500)) print(dhs) ## End(Not run)
Method show
for objects of class prevR:
shows a summary of the object's characteristics.
## S4 method for signature 'prevR' show(object)
## S4 method for signature 'prevR' show(object)
object |
object of class prevR. |
Exactly the same as print()
.
fdhs ## Not run: dhs <- rings(fdhs, N = c(100, 300, 500)) dhs ## End(Not run)
fdhs ## Not run: dhs <- rings(fdhs, N = c(100, 300, 500)) dhs ## End(Not run)
This function forces points of an object of class
[
sf] located outside
the limits defined by an object of class sp::SpatialPolygons
to NA
.
st_filter_prevR(x, y)
st_filter_prevR(x, y)
x |
object of class sf::sf |
y |
object of class sf::sf |
The function try to apply sf::st_filter()
. In case it fails,
it will try to rebuild y
according to spherical geometry
(see sf::st_as_s2()
) before filtering. If it still fail, it will return
x
unfiltered.
Return x
filtered by y
Method summary
for objects of class prevR:
shows a summary of the variables of the object.
## S4 method for signature 'prevR' summary(object, probs = c(0, 0.1, 0.25, 0.5, 0.75, 0.8, 0.9, 0.95, 0.99, 1))
## S4 method for signature 'prevR' summary(object, probs = c(0, 0.1, 0.25, 0.5, 0.75, 0.8, 0.9, 0.95, 0.99, 1))
object |
object of class prevR. |
probs |
vector of probabilities with values in |
summary(fdhs) ## Not run: dhs <- rings(fdhs, N = c(100, 300, 500)) summary(dhs) summary(dhs, c(0, 0.25, 0.5, 0.75, 1)) ## End(Not run)
summary(fdhs) ## Not run: dhs <- rings(fdhs, N = c(100, 300, 500)) summary(dhs) summary(dhs, c(0, 0.25, 0.5, 0.75, 1)) ## End(Not run)
Two custom themes for ggplot2 graphs, hiding axis.
theme_prevR(base_size = 12) theme_prevR_light(base_size = 12)
theme_prevR(base_size = 12) theme_prevR_light(base_size = 12)
base_size |
base font size |
This dataset provides boundaries of all countries in the world, in decimal degrees. Available variables are:
"FIPS" FIPS 10-4 Country Code.
"ISO2" ISO 3166-1 Alpha-2 Country Code.
"ISO3" ISO 3166-1 Alpha-3 Country Code.
"UN" ISO 3166-1 Numeric-3 Country Code.
"NAME" Name of country/area.
"AREA" Land area, FAO Statistics (2002).
"POP2005" Population, World Population Prospects (2005).
"REGION" Macro geographical (continental region), UN Statistics.
"SUBREGION" Geographical sub-region, UN Statistics.
"LON" Longitude.
"LAT" Latitude.
Object of class sp::SpatialPolygonsDataFrame.
The boundaries, names designations used do not imply official endorsement or acceptance by the authors. Use this dataset with care, as several of the borders are disputed.
Provided by Bjorn Sandvik on The dataset was derived by Schuyler Erle from public domain sources. Sean Gilles did some clean up and made some enhancements. The dataset is available under a Creative Commons Attribution-Share Alike License (https://creativecommons.org/licenses/by-sa/3.0/).
plot(TMWorldBorders["NAME"])
plot(TMWorldBorders["NAME"])
Update an object of class prevR
created with a previous version of the
package to the last version. In particular, it will convert any boundary
slot defined with the sp
package to sf
class.
update_prevR(object)
update_prevR(object)
object |
a |
a prevR
object
Several functions (for example KernSmooth::bkde2D()
)
return a surface as a list "xyz" composed of three elements:
vector of ordinates in the x dimension,
vector of ordinates in the y dimension and
a matrix with the values of the surface in x and y.
This function transforms a list "xyz" into a data frame.
xyz2dataframe(xyz, xcol = 1, ycol = 2, zcol = 3)
xyz2dataframe(xyz, xcol = 1, ycol = 2, zcol = 3)
xyz |
a list with 3 elements: a vector with x-coordinates,
a vector with y-coordinates and
and matrix with value for each point of coordinates |
xcol |
x index. |
ycol |
y index. |
zcol |
z index. |
A data.frame
.
xyz
could be a list like x,y,z1,z2,z3
.
If so, zcol
should be equal
to c("z1","z2","z3")
or c(3,4,5)
.
x <- matrix(c(2, 4, 6, 8, 10, 2, 4, 6, 8, 10), ncol = 2) op <- KernSmooth::bkde2D(x, bandwidth = 1) str(op) op.df <- xyz2dataframe(op) str(op.df)
x <- matrix(c(2, 4, 6, 8, 10, 2, 4, 6, 8, 10), ncol = 2) op <- KernSmooth::bkde2D(x, bandwidth = 1) str(op) op.df <- xyz2dataframe(op) str(op.df)