Computer vision systems that can interpret images from street cameras

Researchers are developing algorithms capable of monitoring images from security cameras and identifying the occurrence of incidents in urban environments (image: Camerite/Avenida Paulista)

More than 1 million security cameras are in operation in the city of São Paulo, according to ABESE, the Brazilian electronic security association.

At present, there is no automated way to analyze the images captured by these cameras in order to detect abnormal incidents or behavior, for example, and issue real-time alerts. In the near future, however, computers could perform this task.

To realize this goal, a group of researchers from the Department of Computer Science at the University of São Paulo’s Mathematics & Statistics Institute (IMEUSP) in Brazil, in collaboration with colleagues at New York University and IBM’s T. J. Watson Research Center in the United States, has set out to enhance existing computer vision techniques and develop new techniques that will enable computer systems to interpret images acquired by video cameras.

As part of the FAPESP-funded Thematic Project “Models and methods of e-Science for life and agricultural sciences”, a database administration platform has been developed to store images of urban environments acquired by street cameras or posted to the internet by service providers like Camerite.

“Our goal is to accumulate these urban images and store them in multi-terabyte databases that can be used to develop algorithms capable of analyzing large volumes of data to identify behavioral patterns,” said Roberto Marcondes Cesar Junior, a professor at IMEUSP and principal investigator for the project, in an interview with Agência FAPESP.

According to Cesar Junior, the computer vision platforms developed by different groups around the world, including the researchers involved in his project, are already capable of identifying individuals in images acquired from cameras, as well as locating body parts such as hands and capturing movement.

The researchers now aim to enhance existing algorithms or develop new ones that can identify what individuals or groups are doing in stored images or in real time.

“We plan to create algorithms that can interpret situations with a higher degree of abstraction than finding a person, car or building in an image,” Cesar Junior said. “That means semantically more complex queries, such as whether a person is standing still or moving, using a cell phone, joining or leaving a group of people, and so on.”

Algorithms like these could use their interpretations of people’s behavior in an image to infer the likelihood of traffic accidents, congestion or road closure, he added.

“Real-time monitoring of security cameras by computer algorithms would detect traffic accidents more quickly, for example, instantly alerting law enforcement and paramedics to provide the necessary assistance and restore the flow of traffic,” Cesar Junior said.

Storm damage

One of the possible applications of computer vision predicted by the researchers is the detection of storm damage and incidents such as vehicle collisions, fallen trees and flooding caused by torrential tropical downpours.

According to Cesar Junior, computer vision algorithms in service today can identify people, automotive vehicles and buildings in camera images under normal weather conditions, but in heavy rain, they typically fail to do so.

“Street camera image quality degrades very quickly when it rains because of changes in lighting conditions and increasing noise. It’s much harder for algorithms to identify people, buildings and vehicles in heavy rain for this reason,” he said.

“So we’re working on enhanced algorithms that can not only identify elements in a scene when it’s raining but also detect traffic collisions, for example, which tend to occur more frequently in rainy conditions.”

To upgrade existing algorithms and create new ones, the researchers have programmed their software to collect images from street cameras available on the internet whenever it is raining in São Paulo. The software identifies rainy conditions using data from weather services, such as those provided by the Center for Weather Forecasting & Climate Studies (CPTEC) at Brazil’s National Space Research Institute (INPE) and by Climatempo, and then gathers the corresponding street camera images acquired by Camerite and other apps.

When the platform identifies rain in a neighborhood using the data feeds from these weather services, it starts to collect and store images from street cameras in the area automatically, Cesar Junior explained.
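The trigger logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual platform: the area names, camera IDs, and the weather-update callback are all hypothetical stand-ins for the real Camerite feeds and CPTEC/Climatempo data.

```python
from dataclasses import dataclass, field

@dataclass
class RainTriggeredCollector:
    """Sketch of rain-triggered collection: when a weather feed reports
    rain in an area, queue image grabs from every camera in that area."""
    cameras_by_area: dict            # area name -> list of camera IDs (hypothetical)
    collected: list = field(default_factory=list)

    def on_weather_update(self, area: str, is_raining: bool) -> None:
        # Called whenever a weather service reports conditions for an area.
        if is_raining:
            for cam in self.cameras_by_area.get(area, []):
                # In the real system this would fetch and store a frame;
                # here we just record which camera would be sampled.
                self.collected.append((area, cam))

collector = RainTriggeredCollector({"Pinheiros": ["cam-01", "cam-02"]})
collector.on_weather_update("Pinheiros", is_raining=True)   # queues 2 cameras
collector.on_weather_update("Moema", is_raining=True)       # no cameras registered
print(len(collector.collected))  # → 2
```

The key design point is that the weather feed, not the video stream, drives collection: cameras are only sampled while rain is reported in their neighborhood, which keeps the stored dataset focused on the degraded conditions the researchers want to study.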

“It would be impossible for human beings to do this,” he said. “A person would at most be able to look at the images acquired during a day by one street camera and identify the time at which it rained, but no human being could monitor a month’s worth of images from thousands of cameras all over the city, for example.”

Computer vision technology depends to a great extent on the accumulation of data, he went on, because algorithms learn statistically. The larger the volume of data to analyze, the better the computational performance.

“State-of-the-art algorithms today, such as those used by Facebook and Google, for example, performed very poorly 15 or 20 years ago because there wasn’t much data,” Cesar Junior said. “Today, they’re unbeatable because of the massive volume of data available.”