GIS product developer with a background in geography, information retrieval, spatial cognition, and open science/data platforms.
GTU Geography Honors Society
UMBC GIScience and Web Development Outstanding Senior
NSF Research Experience for Undergraduates Scholar
UMBC Undergraduate Research Award
Geography Ph.D. (Candidate)
Geographic information retrieval
Information Systems B.S.; Geography B.A.
GIScience; Web Development
GPA: 3.83 / 4.0
This study surveys the state of search on open geospatial data portals. We seek to understand 1) what users are able to control when searching for geospatial data, 2) how these portals process and interpret a user's query, and 3) if and how user query reformulations alter search results. We find that most users initiate a search using a text input and several predefined facets (such as filters for tags or formats). Some portals supply a map view of data or topic explorers. To process and interpret queries, most portals use a vertical full-text search engine like Apache Solr to query data from a content-management system like CKAN. When processing queries, most portals initially filter results and then rank the remaining results using a common keyword-frequency relevance metric (e.g., TF-IDF). Some portals use query expansion. We identify and discuss several recurring usability constraints across portals. For example, users are typically only given text lists to interact with search results. Furthermore, ranking is rarely extended beyond syntactic comparison of keyword similarity. We discuss several avenues for improving search for geospatial data, including alternative interfaces and query processing pipelines.
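The filter-then-rank pipeline described in this abstract can be illustrated with a minimal sketch. The dataset records and query below are hypothetical stand-ins for a portal catalog, and the scoring is a textbook TF-IDF sum, not any specific portal's implementation.

```python
import math
from collections import Counter

# Hypothetical catalog entries: dataset id -> metadata keywords.
docs = {
    "bus-routes": "bus routes transit schedule city transit",
    "crime-2020": "crime incidents city police 2020",
    "parcels": "real estate parcels property values city",
}

def tf_idf_rank(query, docs):
    """Score each record against the query with a basic TF-IDF sum,
    then rank by descending relevance (the filter-then-rank pattern)."""
    n = len(docs)
    tokenized = {doc_id: text.split() for doc_id, text in docs.items()}
    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query.split():
            # df: how many records contain the term at all
            df = sum(1 for toks in tokenized.values() if term in toks)
            if df == 0:
                continue
            idf = math.log(n / df)
            score += (tf[term] / len(tokens)) * idf
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: -kv[1])

ranked = tf_idf_rank("city transit", docs)
```

Note that a common term like "city" contributes nothing here (its IDF is zero because every record contains it), which is exactly the syntactic keyword-similarity behavior the abstract critiques.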
Location data from social network posts are attractive for answering all sorts of questions by spatial analysis. However, it is often unclear what this information locates. Is it a point of interest (POI), the device at the time of posting, or something else? As a result, locational references in posts may get misinterpreted. For example, a restaurant check-in on Facebook only locates that POI. But check-ins have been used to locate their poster, their poster's home, or where the posting event occurred. Furthermore, post metadata terms like place and location are ambiguous, making information integration difficult. Consequently, analysts may not be using the correct locational references pertinent to their questions. In this paper, we attempt to clarify and systematize what can be located within social network post metadata. We examine locational references in post metadata documentation from several social networks. We identify three common groups of locatable things: places recorded in a poster's profile, devices, and points of interest. We posit that these groups can be described using the World Wide Web Consortium's (W3C) provenance ontology (PROV), in particular, PROV's agent, activity, and entity concepts. Next, we encode example post metadata with these descriptions, and show how they support answering questions such as "which country's citizens take the most Flickr photos of the Eiffel Tower?" The theoretical contribution of this work is a taxonomy of locatable things derived from social network posts, and a tool-supported method for describing them to users.
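A minimal sketch of the kind of encoding this abstract describes, mapping a post's locatable things onto PROV's agent, activity, and entity concepts. The post record, user id, and coordinates below are hypothetical, and the dictionaries only borrow PROV's vocabulary rather than implementing the full ontology.

```python
# Hypothetical Flickr post, with its locational references separated by
# PROV concept: the POI is an entity, the posting event is an activity
# (located by the device), and the poster is an agent (located by profile).
post = {
    "entity": {
        "prov:Entity": "flickr:photo/123",          # hypothetical photo id
        "poi": {"name": "Eiffel Tower", "lat": 48.8584, "lon": 2.2945},
    },
    "activity": {
        "prov:Activity": "posting",
        "device_location": {"lat": 48.8602, "lon": 2.2977},
    },
    "agent": {
        "prov:Agent": "flickr:user/alice",          # hypothetical user id
        "profile_home": "France",
    },
}

def photos_of_poi_by_country(posts, poi_name):
    """Count posts whose entity-level POI matches, grouped by the agent's
    profile country. Supports questions like the Eiffel Tower example."""
    counts = {}
    for p in posts:
        if p["entity"]["poi"]["name"] == poi_name:
            country = p["agent"]["profile_home"]
            counts[country] = counts.get(country, 0) + 1
    return counts
```

Separating the three concepts makes the ambiguity concrete: an analyst asking about posters' citizenship must read the agent field, not the POI or device coordinates.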
Access to public data in the United States and elsewhere has steadily increased as governments have launched geospatially-enabled web portals like Socrata, CKAN, and Esri Hub. However, data discovery in these portals remains a challenge for the average user. Differences between users' colloquial search terms and authoritative metadata impede data discovery. For example, a motivated user with expertise can leverage valuable public data about transportation, real estate values, and crime, yet it remains difficult for the average user to discover and leverage data. To close this gap, community dashboards that use public data are being developed to track initiatives for public consumption; however, dashboards still require users to discover and interpret data. Alternatively, local governments are now developing data discovery systems that use voice assistants like Amazon Alexa and Google Home as conversational interfaces to public data portals. We explore these emerging technologies, examining the application areas they are designed to address and the degree to which they currently leverage existing open public geospatial data. In the context of ongoing technological advances, we envision using core concepts of spatial information to organize the geospatial themes of data exposed through voice assistant applications. This will allow us to curate them for improved discovery, ultimately supporting more meaningful user questions and their translation into spatial computations.
We investigate the relations between human cognitive scales and spatial information. To help organize spatial information, particularly around how humans perceive and interact with spaces around them, we explore the intersection of Kuhn's (2012) spatial information taxonomy and Montello's (1993) spatial scale taxonomy. We discuss results and challenges while using this intersection to categorize phenomena from an earthquake case study.
Farmers face pressure to respond to unpredictable weather, the spread of pests, and other variable events on their farms. This paper proposes a framework for data aggregation from diverse sources that extracts named places impacted by events relevant to agricultural practices. Our vision is to couple natural language processing, geocoding, and existing geographic information retrieval techniques to increase the value of already-available data through aggregation, filtering, validation, and notifications, helping farmers make timely and informed decisions with greater ease.
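The place-extraction step of the proposed pipeline can be sketched with a toy gazetteer match. The gazetteer entries, coordinates, and report text below are all hypothetical; a full implementation would couple this with real NLP and geocoding services as the abstract envisions.

```python
# Hypothetical gazetteer of place names relevant to a farm's region.
gazetteer = {
    "Frederick County": (39.47, -77.40),
    "Carroll County": (39.56, -77.02),
}

def extract_places(text, gazetteer):
    """Return (name, coords) for each gazetteer place mentioned in the text.
    Stands in for the NLP + geocoding stage of the proposed framework."""
    return [(name, coords) for name, coords in gazetteer.items()
            if name in text]

report = "Armyworm spread reported across Frederick County this week."
hits = extract_places(report, gazetteer)
```

Each hit could then drive the aggregation, filtering, and notification steps, for example alerting only farms whose fields fall near the named place.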
The often fragmented process of online spatial data retrieval remains a barrier to domain scientists interested in spatial analysis. Although there is a wealth of hidden spatial information online, scientists without prior experience querying web APIs (Application Programming Interfaces) or scraping web documents cannot extract this potentially valuable implicit information across a growing number of sources. In an attempt to broaden the spectrum of exploitable spatial data sources, this paper proposes an extensible model for deriving locational references that shifts extraction and encoding logic from the user to a preprocessing mediation layer. To implement this, we develop a user interface that collects data through web APIs and scrapers, derives locational references as geometries, and re-encodes the data as explicit spatial information, usable with spatial analysis tools such as those in R or ArcGIS.
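A minimal sketch of what such a mediation layer does at the record level: take a record fetched from an API or scraper and re-encode its implicit locational reference as an explicit GeoJSON-style geometry. The field names (`lat`, `lon`, `address`) are hypothetical, not the paper's actual schema.

```python
def derive_geometry(record):
    """Re-encode a raw record as a GeoJSON-style Feature, deriving an
    explicit point geometry when coordinates are present in the record."""
    if "lat" in record and "lon" in record:
        geom = {"type": "Point",
                "coordinates": [record["lon"], record["lat"]]}
    elif "address" in record:
        # A full mediation layer would geocode the address here; this
        # sketch only marks the geometry as not yet derivable.
        geom = None
    else:
        geom = None
    # Remaining fields become feature properties for downstream analysis.
    props = {k: v for k, v in record.items() if k not in ("lat", "lon")}
    return {"type": "Feature", "geometry": geom, "properties": props}

feature = derive_geometry({"lat": 39.25, "lon": -76.71, "name": "UMBC"})
```

Emitting GeoJSON-shaped features is one plausible target encoding, since both R (e.g., via `sf`) and ArcGIS can consume it directly.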
We explore the idea of spatial lenses as pieces of software interpreting data sets in a particular spatial view of an environment. The lenses serve to prepare the data sets for subsequent analysis in that view. Examples include a network lens to view places in a literary text, or a field lens to interpret pharmacy sales in terms of seasonal allergy risks. The theory underlying these lenses is that of core concepts of spatial information, but here we exploit how these concepts enhance the usability of data rather than that of systems. Spatial lenses also supply transformations between multiple views of an environment, for example, between field and object views. They lift these transformations from the level of data format conversions to that of understanding an environment in multiple ways. In software engineering terms, spatial lenses are defined by constructors, generating instances of core concept representations from spatial data sets. Deployed as web services or libraries, spatial lenses would make larger varieties of data sets amenable to mapping and spatial analysis, compared to today's situation, where file formats determine and limit what one can do. To illustrate and evaluate the idea of spatial lenses, we present a set of experimental lenses, implemented in a variety of languages, and test them with a variety of data sets, some of them non-spatial.
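In software engineering terms, a lens is a constructor from a data set to a core-concept representation. The toy "field lens" below illustrates this shape: it turns point observations (the pharmacy-sales values and coordinates are invented for illustration) into a field, i.e., a function from location to value, via nearest-neighbor lookup. Real lenses would use richer interpolation; this is only a sketch of the constructor pattern.

```python
def field_lens(points):
    """Constructor: turn a point data set into a field representation.

    points: list of ((x, y), value) observations.
    Returns a field, i.e., a callable (x, y) -> value, here using
    nearest-neighbor assignment as the simplest possible interpretation.
    """
    def field(x, y):
        (_, value) = min(
            points,
            key=lambda p: (p[0][0] - x) ** 2 + (p[0][1] - y) ** 2,
        )
        return value
    return field

# Hypothetical pharmacy sales at two store locations, viewed as a
# seasonal-allergy-risk field over the surrounding area.
sales = [((0.0, 0.0), 12), ((10.0, 0.0), 3)]
allergy_field = field_lens(sales)
```

Once constructed, the field can be queried at any location, independent of the file format the observations arrived in, which is the usability shift the abstract argues for.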