% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/normGeometry.R
\name{normGeometry}
\alias{normGeometry}
\title{Normalise geometries}
\usage{
normGeometry(
  input = NULL,
  pattern = NULL,
  query = NULL,
  thresh = 10,
  beep = NULL,
  simplify = FALSE,
  stringdist = TRUE,
  strictMatch = FALSE,
  verbose = FALSE
)
}
\arguments{
\item{input}{\code{\link[=character]{character(1)}}\cr path of the file to normalise. If
this is left empty, all files at stage two that match \code{pattern} are
processed.}

\item{pattern}{\code{\link[=character]{character(1)}}\cr an optional regular expression.
Only dataset names which match the regular expression will be processed.}

\item{query}{\code{\link[=character]{character(1)}}\cr part of the SQL query (starting
from WHERE) used to subset the input geometries, for example
\code{"where NAME_0 = 'Estonia'"}. The first part of the query (where the layer is
defined) is derived from the metadata of the currently handled geometry.}

\item{thresh}{\code{\link[=integer]{integerish(1)}}\cr percentage of overlap below
which two geometries (the input and the base) are considered to be the
same. This is required because polygons from different sources are often
not identical, even though they describe the same territorial unit.}

\item{beep}{\code{\link[=integer]{integerish(1)}}\cr number specifying which sound is
played to signal that the program has reached a point of interaction, see
\code{\link[beepr]{beep}}.}

\item{simplify}{\code{\link[=logical]{logical(1)}}\cr whether or not to simplify
geometries.}

\item{stringdist}{\code{\link[=logical]{logical(1)}}\cr whether or not to use string
distance to find matches (should not be used for large datasets/when a
memory error is shown).}

\item{strictMatch}{\code{\link[=logical]{logical(1)}}\cr whether or not matches are
strict, i.e., there should be clear one-to-one relationships and no changes
in broader concepts.}

\item{verbose}{\code{\link[=logical]{logical(1)}}\cr be verbose about what is happening
(default \code{FALSE}). Furthermore, you can use
\code{\link{suppressMessages}} to make this function completely silent.}
}
\value{
This function harmonises and integrates so far unprocessed geometries
at stage two into stage three of the geospatial database. It produces for
each main polygon (e.g. nation) in the registered geometries a spatial file
of the specified file-type.
}
\description{
Harmonise and integrate geometries into a standardised format
}
\details{
To normalise geometries, this function proceeds as follows:
\enumerate{ \item Read in \code{input} and extract initial metadata from
the file name. \item In case filters are set, the new geometry is filtered
by those. \item The territorial names are matched with the gazetteer to
harmonise new territorial names (at this step, the function might ask the
user to edit the file 'matching.csv' to align new names with already
harmonised names). \item Loop through every nation potentially included in
the file that shall be processed and carry out the following steps:
\itemize{ \item In case the geometries are provided as a list of simple
feature POLYGONS, they are dissolved into a single MULTIPOLYGON per main
polygon. \item In case the nation to which a geometry belongs has not yet
been created at stage three, the following steps are carried out:
\enumerate{ \item Store the current geometry as basis of the respective
level (the user needs to make sure that all following levels of the same
dataseries are perfectly nested into those parent territories, for example
by using the GADM dataset) } \item In case the nation to which the geometry
belongs has already been created, the following steps are carried out:
\enumerate{ \item Check whether the new geometries have the same coordinate
reference system as the already existing database and re-project the new
geometries if this is not the case. \item Check whether all new geometries
are already exactly matched spatially and stop if that is the case. \item
Check whether the new geometries are all within the already defined
parents, and save those that are not as a new geometry. \item Calculate
the spatial overlap and separate the geometries into those that overlap by
more than \code{thresh} and those that overlap by less. \item For all units
that do match, copy the gazID from the geometries they overlap. \item For all
units that do not match, rebuild the metadata and assign a new gazID.} \item Store the
processed geometry at stage three.} \item Move the geometry to the folder
'/processed', if it is fully processed.}
}
\examples{
if(dev.interactive()){
  library(sf)

  # build the example database
  adb_example(until = "regGeometry", path = tempdir())

  # normalise all geometries ...
  normGeometry(pattern = "estonia")

  # ... and check the result
  st_layers(paste0(tempdir(), "/geometries/stage3/Estonia.gpkg"))
  output <- st_read(paste0(tempdir(), "/geometries/stage3/Estonia.gpkg"))
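
  # A minimal, hedged sketch of the kind of overlap comparison described in
  # the details. This is not the code normGeometry runs internally; the
  # objects 'base', 'new', 'shared' and 'overlap' are hypothetical and do not
  # depend on the example database.
  base <- sf::st_sfc(sf::st_polygon(list(
    rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))
  )))
  new <- sf::st_sfc(sf::st_polygon(list(
    rbind(c(0, 0), c(10, 0), c(10, 9), c(0, 9), c(0, 0))
  )))

  # share of the base polygon's area that the new polygon covers, in percent;
  # a value like this is what a threshold such as 'thresh' can be compared to
  shared <- sf::st_intersection(base, new)
  overlap <- as.numeric(sf::st_area(shared) / sf::st_area(base)) * 100
  overlap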
}
}
\seealso{
Other normalise functions: 
\code{\link{normTable}()}
}
\concept{normalise functions}
