Information generated on social media sites such as Twitter, Facebook, Flickr, and Instagram is fast becoming a powerful and ubiquitous new source of time-critical data needed to aid decision making during extreme weather events and emergency situations. The large-scale spatio-temporal data harvested from these sites during past events such as the 2010 earthquake in Haiti, the 2011 tsunami in Japan, and the 2010–2011 floods in Queensland, Australia, show that real-time crowd-sourced disaster information has huge potential to help decision makers and first responders detect disruptive events quickly, gain situational awareness, and respond swiftly to unfolding emergencies.
Content of this type, generated on social media sites using mobile devices and carrying geospatial information, has been termed Volunteered Geographical Information (VGI). As promising as VGI data collection may appear, harnessing social media data for disaster management is not straightforward; it often requires a carefully designed solution.
One such solution is the open-source, cloud-based Cognicity software, developed as part of the PetaJakarta.org project. PetaJakarta.org is an urban data collection initiative developed by the SMART Infrastructure Facility (University of Wollongong, Australia) in collaboration with the Jakarta Emergency Service (BPBD DKI Jakarta) and Twitter Inc. As part of the PetaJakarta data collection campaign, Twitter users in the city of Jakarta are encouraged to share tweets about flood conditions in their localities, preferably with embedded photos and text descriptions of the water level. Cognicity then parses the incoming torrent of tweets, filtering for flood-related tweets by the keywords “flood” or “banjir” (the Bahasa Indonesia word for flood). The system distinguishes geo-located from non-geo-located tweets and also supports two-way communication, in which users can confirm whether they are actually reporting an ongoing flood.
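The filtering and sorting steps described above can be illustrated with a minimal sketch. It is not Cognicity's actual code: the `Tweet` record, the `triage` function, and the case-insensitive substring rule are assumptions made for illustration, and the sample tweets and coordinates are invented. Only the two keywords come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Keywords named in the text; how they are matched is an assumption.
FLOOD_KEYWORDS = ("flood", "banjir")


@dataclass
class Tweet:
    """Hypothetical, simplified tweet record (not the Twitter API schema)."""
    text: str
    # (longitude, latitude) when the tweet is geo-tagged, otherwise None
    coordinates: Optional[Tuple[float, float]] = None


def is_flood_related(tweet: Tweet) -> bool:
    """Case-insensitive substring match against the flood keywords."""
    text = tweet.text.lower()
    return any(keyword in text for keyword in FLOOD_KEYWORDS)


def triage(tweets: List[Tweet]) -> Tuple[List[Tweet], List[Tweet]]:
    """Keep flood-related tweets and split them by presence of coordinates."""
    geo: List[Tweet] = []
    non_geo: List[Tweet] = []
    for tweet in tweets:
        if not is_flood_related(tweet):
            continue  # drop tweets that mention neither keyword
        if tweet.coordinates is not None:
            geo.append(tweet)
        else:
            non_geo.append(tweet)
    return geo, non_geo


# Invented sample data, for illustration only.
sample = [
    Tweet("Banjir setinggi lutut di depan rumah", (106.85, -6.21)),
    Tweet("Flood near the station, avoid the underpass"),
    Tweet("Macet parah pagi ini"),
]
geo, non_geo = triage(sample)
```

A plain substring match is deliberately crude: it also accepts words such as “flooding” or “floodlight”, which is one reason a confirmation step with the user remains useful before a report is placed on a map.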
After the data are gathered and sorted, the confirmed geo-located tweets are used to generate a near real-time map of flood conditions in various parts of the city. This publicly available city-scale flood map helps citizens navigate safely through the city and also enables emergency services (BPBD DKI Jakarta) to plan and coordinate operational activities. While this project has recorded huge success over the last few years of operation, there remain some “wicked problems” that need to be addressed going forward.
One such issue is how to improve the objectivity of data contributed by users. Users’ opinions about flood conditions, and the way they report those opinions, can vary markedly depending on factors such as level of education, motivation, device functionality, personality, and system constraints (e.g., Twitter’s 140-character limit). This is a major problem when trying to estimate the water level information needed for flood maps. For example, what is the most objective way for a user to report the flood height at a given location accurately? Clearly, measurements tied to a user’s own height (e.g., water at knee level) are highly subjective, since users’ heights vary. Could measuring water level against landmarks be a more reliable approach, given the lack of accurate, high-resolution elevation data in most flood-prone cities in developing nations?
The second issue relates to the communicative aspect of social media. If social media data is to be used to facilitate crisis communication and situational awareness, a lot of useful information can be lost by failing to consider the sentiments, contextual information, and language-specific operations associated with disaster-related reports that users post on social media sites. This aspect is often lacking in existing solutions. Future solutions can contribute immensely by addressing these two wicked problems, along with other important issues such as incentive mechanisms for data contributors and a robust confirmatory process for selecting accurate and reliable data.