Keynote Talks

09:00-09:45 : Storage and Indexing of Spatial data on HDFS -- Dr. Siva Ravada (Oracle)


Abstract: Spatial Hadoop is gaining popularity as a big data platform for processing large volumes of Spatial data. However, many real-world applications encounter performance and scalability problems when they are ported to Spatial Hadoop. One of the main reasons is the lack of an automatic data partitioning and distribution mechanism for Spatial data on HDFS. Ideas from earlier research on shared-nothing models of parallel computing can be applied to current Hadoop systems to improve the performance and throughput of Spatial applications that use HDFS storage. Because network latency between Hadoop nodes varies from system to system (from InfiniBand to Ethernet), these techniques need to be adapted to each system.

Current Spatial applications deal with a wide variety of data, including Raster, Vector, and real-time sensor data. Storage models depend on the type of data and the nature of the application. This talk will present some of the common problems encountered in storing Spatial data on HDFS and offer some initial thoughts on clustering and declustering of data on HDFS for both Raster and Vector data types. The goal of this talk is to start a discussion on storage models for Hadoop, so that storage can be handled automatically by Spatial Hadoop systems, leaving application developers free to focus on the analysis algorithms.

Bio: Dr. Siva Ravada is Director of Development for Oracle’s Spatial technologies. Siva joined Oracle in 1997, after receiving a Ph.D. in Computer Science from the University of Minnesota, with a specialty in Spatial Technology for High Performance GIS Computing Environments. Siva is one of the founding members of Spatial technologies development at Oracle. He currently manages the Spatial and MapViewer development teams. He holds numerous patents and has several publications in peer-reviewed journals and conferences.

11:20-12:05 : Big Data and Analytics, a Geospatial Industrial Perspective -- Dr. Erik G. Hoel (Esri)

Abstract: This talk will focus on our current research on distributed storage and processing of large volumes of spatial data. We will highlight GIS Tools for Hadoop, Esri’s open-source spatial framework for Hadoop. In addition, we will discuss our current thinking on how to enable users and organizations with big spatial data problems to manage, visualize, and run analysis in order to discover meaningful and valuable content in their sea of data. The focus is not on providing SDKs that only developers can utilize, but rather on how to extend familiar user experiences to exploit big data stored on clusters (including HDFS), as well as traditional databases and NoSQL data stores. We will also discuss some of the significant emerging trends as big data moves into its second generation; i.e., beyond the traditional MapReduce programming model.

Bio: Dr. Hoel is a Computer Scientist working in the Software Research and Development Division of Esri. Dr. Hoel is the development lead of the Geodata Management Group. This group is responsible for the information and transaction models for the ArcGIS platform. He joined Esri shortly after obtaining his Ph.D. and M.S. in Computer Science from the University of Maryland with a focus on parallel spatial indexing and analysis, and his B.A. in Computer Science and Statistics from the University of California at Berkeley. His current research interests include big data analytics, network models for utilities, spatio-temporal information systems, mobile computing, distributed and parallel processing, and geostatistical analysis. Dr. Hoel has over 25 years of experience in the computer software industry and has authored more than 35 publications.

13:30-14:15 : SpatialHadoop: A MapReduce Framework for Spatial Data -- Professor Mohamed Mokbel (University of Minnesota)

Abstract: This talk is about SpatialHadoop, a full-fledged MapReduce framework with native support for spatial data. SpatialHadoop is a comprehensive extension to Hadoop that injects spatial data awareness into each Hadoop layer, namely, the language, storage, MapReduce, and operations layers. In the language layer, SpatialHadoop adds a simple and expressive high-level language for spatial data types and operations. In the storage layer, SpatialHadoop adapts traditional spatial index structures (Grid, R-tree, and R+-tree) to form a two-level spatial index. SpatialHadoop enriches the MapReduce layer with new components for efficient and scalable spatial data processing. In the operations layer, SpatialHadoop is already equipped with three basic operations (range query, kNN, and spatial join) as case studies. Other spatial operations can be added following a similar approach. The talk will also discuss various active projects for big spatial data that take advantage of SpatialHadoop.

Bio: Dr. Mohamed Mokbel is an Associate Professor in the Department of Computer Science and Engineering, University of Minnesota, as well as the founding Technical Director of KACST GIS Technology Innovation Center, Umm Al-Qura University, Saudi Arabia. He obtained his Ph.D. from Purdue University in Lafayette, Indiana, and his M.Sc. and B.Sc. from Alexandria University. His main research interests include database systems, cloud computing, location-based services, and GIS, and his research has appeared in over 130 publications. Dr. Mokbel also received the US National Science Foundation (NSF) Career Award in 2010, the most prestigious award given to junior faculty by the NSF.
