Basic information

Title: Video Surveillance for Impaired persONs
Acronym: VISION
Company: HTLab
Trial/Replicator: Trial
Type: Start-up

Where: Bristol (UK)
When (time-plan): January to June 2020 (6 months)

Project summary

The objective of VISION is to develop a video surveillance framework for smart cities, to be integrated into the FLAME platform, that provides a social, shared video surveillance tool to help impaired people (blind people, people with limited mobility, deaf people, elderly people) who may need to be guided or monitored safely within the city. VISION is composed of three architectural elements: the DiMoViS platform, Tourist Eyes and the Lying Person Recognition element.

Figure 1. ViSiOn Architecture and its Interaction with the Underlying FLAME Platform

The DiMoViS platform is a smart, flexible and social video surveillance platform that, when deployed on the FLAME platform, also becomes highly scalable in terms of the number of transmitting and receiving devices.
Each user of the platform can easily share their own video source device (e.g. a webcam, IP cam, smartphone or tablet) by connecting it to the network (over either a WiFi or a cellular connection) and registering it with the DiMoViS platform through an Android app or a web portal. During registration, the owner of the video source device can specify one or more access profiles to restrict access to different groups of users. Each user, in turn, sees in the same Android app or web portal a map of the area covered by the DiMoViS platform service (e.g. a smart city), on which all active cameras are represented by green circles, as shown in the figure below.

Figure 2. Graphic User Interface of the VISION App
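
As an illustration of the registration step described above, the sketch below posts a camera's metadata and access profiles to a hypothetical DiMoViS REST endpoint. The endpoint URL, field names and profile labels are assumptions made for the example; the actual DiMoViS API is not specified in this document.

```python
import requests

# Hypothetical DiMoViS registration endpoint (illustrative only; the real
# API and field names are not defined in this document).
DIMOVIS_REGISTER_URL = "https://dimovis.example.org/api/cameras"

def register_camera(owner_id: str, stream_url: str, lat: float, lon: float,
                    access_profiles: list[str]) -> str:
    """Register a user-provided video source and restrict access to the
    given profiles (e.g. 'family', 'caregivers', 'public')."""
    payload = {
        "owner": owner_id,
        "stream_url": stream_url,            # e.g. an HTTP-based video flow
        "location": {"lat": lat, "lon": lon},
        "access_profiles": access_profiles,
    }
    resp = requests.post(DIMOVIS_REGISTER_URL, json=payload, timeout=10)
    resp.raise_for_status()
    return resp.json()["camera_id"]          # assumed response field

# Example: share a home webcam with relatives only (addresses are fictitious).
# camera_id = register_camera("user42", "http://192.0.2.10/stream.m3u8",
#                             51.4545, -2.5879, ["family"])
```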

A registered user can associate with one or more cameras very easily, either by clicking on the map shown on the screen or by scanning a QR code placed near the camera. The user can then customize the received video flows and the events associated with each camera by adding optional functions that run in the underlying softwarized network as virtual functions (VFs). For example, the user can request to be alerted when a specific camera detects motion, add a mosaic view that combines several video streams into a single frame, or request video transcoding to adapt the streams to the specific playout device he/she is using. These functions are automatically included in the chains between the video sources and the devices of the requesting users.
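
The per-user chain of optional functions can be pictured as a simple declarative description, as in the sketch below. The VF names, parameters and structure are assumptions made for illustration, not the FLAME or DiMoViS schema.

```python
# Illustrative description of a per-user service chain between a video
# source and the requesting device.  The function names and parameters
# are examples only.
service_chain = {
    "user": "user42",
    "camera_id": "cam-017",
    "virtual_functions": [
        {"type": "motion_detection",   # alert the user when movement is detected
         "params": {"sensitivity": 0.7, "notify": "push"}},
        {"type": "mosaic",             # combine several streams into a single frame
         "params": {"sources": ["cam-017", "cam-021"], "grid": [1, 2]}},
        {"type": "transcoding",        # adapt the flow to the playout device
         "params": {"codec": "h264", "resolution": "720p"}},
    ],
}
```

In the platform described above, such functions would be instantiated inside the softwarized network and chained between the video source and the user's device.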

The Impaired Person Support element included in DiMoViS aims at enabling impaired persons to be remotely guided by a relative or a friend (hereafter called the remote guide), for example when they have lost their spatial reference points or want to reach an unfamiliar place. More specifically, an impaired person walking in an area covered by the VISION service can request a remote visual guide. To this purpose, he/she starts a phone call with the remote guide and gives his/her approximate position. The remote guide then activates the VISION app and, by browsing the map, selects all the cameras available in the zone where the impaired person might be. Looking at the video flows received from these cameras, the guide starts directing the person over the phone. The remote guide can follow the person along his/her path by switching among the video flows transmitted by the various cameras, and continues to provide instructions until the person reaches the desired place.

Figure 3. Impaired Person Support service element
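
The selection of candidate cameras around the impaired person's approximate position can be seen as a simple geographic filter. The sketch below uses the haversine distance over assumed camera records, purely to illustrate the idea; the actual selection is done manually by the remote guide on the map.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def cameras_near(cameras, lat, lon, radius_m=150):
    """Return the cameras within radius_m of the reported position, closest
    first.  `cameras` is a list of dicts with 'id', 'lat' and 'lon' keys."""
    hits = [(haversine_m(lat, lon, c["lat"], c["lon"]), c) for c in cameras]
    hits.sort(key=lambda t: t[0])
    return [c for d, c in hits if d <= radius_m]
```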

Two extensions of DiMoViS, included in the VISION platform, are Tourist Eyes and Lying Person Recognition.

The objective of Tourist Eyes is to provide blind tourists visiting a smart city with a framework that supports their activities in both outdoor and indoor environments. The goal of this service element is thus to use 5G-compliant frameworks and IP cameras installed along pre-defined tourist paths to guide blind tourists moving around smart cities. Through voice commands issued to a dedicated smartphone app, blind tourists request directions to points of interest (PoIs) along predefined paths, such as restaurants, restrooms, museums and ATMs. The system then localizes them thanks to a wearable hat of a specific color, and the Tourist Eyes element sends audio messages back to their smartphones to guide them along the selected path.
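
Localizing the wearer of the colored hat in a camera frame can be done with a simple color-segmentation step. The sketch below, based on OpenCV 4 and an assumed hue range for the hat, is one possible approach rather than the actual Tourist Eyes algorithm, which is not detailed in this document.

```python
import cv2
import numpy as np

# Assumed HSV range for the hat color (here, a bright orange hat); the real
# Tourist Eyes color and localization method are not specified here.
HAT_LOWER = np.array([5, 150, 150])
HAT_UPPER = np.array([20, 255, 255])

def locate_hat(frame_bgr):
    """Return the (x, y) pixel centre of the largest hat-colored blob in the
    frame, or None if no such blob is found."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, HAT_LOWER, HAT_UPPER)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    blob = max(contours, key=cv2.contourArea)
    m = cv2.moments(blob)
    if m["m00"] == 0:
        return None
    return int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
```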

The Impaired Person Support service is complementary to the Tourist Eyes block, since it can be invoked either when the impaired tourist is on a path registered in the Tourist Eyes platform and needs to reach a place not included in the PoI list, or when he/she needs to reach a Tourist Eyes path from an external place. All the IP cameras installed for the Tourist Eyes service can be associated with the DiMoViS platform, and can therefore be used by all the services it provides.

Figure 4. Tourist Eyes service element

The Lying Person Recognition element, thanks to the installed cameras, can also automatically recognize whether a person is lying on the ground, in indoor or outdoor environments, and send an alert to someone (e.g. a friend or a relative of that person, or the local emergency or healthcare provider). For example, this service can be activated by a relative of an elderly person who wants to monitor him/her. In this case, the video flows generated by all the IP cameras installed close to the places where the impaired person moves are sent to an artificial intelligence tool running inside the network as a virtual function (VF), which recognizes persons lying on the ground through an image recognition algorithm. If a person is detected lying on the ground in an area served by a camera that also covers the person monitored through the Impaired Person Support element, an alert is sent to the relative who requested the monitoring.

Figure 5. Lying Person Recognition service element
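
As a rough illustration of what the recognition VF has to decide, the sketch below flags wide, low foreground blobs obtained by background subtraction as possible lying persons. This is only a crude heuristic stand-in for the image recognition algorithm mentioned above, whose actual design is not described in this document.

```python
import cv2
import numpy as np

# Background subtractor used here as a stand-in for the AI tool running as a
# virtual function in the network; the real VISION algorithm is not specified.
bg_subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

def lying_person_boxes(frame_bgr, min_area=5000, aspect_threshold=1.5):
    """Return foreground bounding boxes (x, y, w, h) whose width/height ratio
    suggests a person lying on the ground (wide, low blobs)."""
    mask = bg_subtractor.apply(frame_bgr)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h >= min_area and w / float(h) >= aspect_threshold:
            boxes.append((x, y, w, h))
    return boxes

def alert_if_lying(frame_bgr, notify):
    """Call notify(box) for every suspected lying person in the frame,
    e.g. to push a message to the relative's VISION app."""
    for box in lying_person_boxes(frame_bgr):
        notify(box)
```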

The VISION platform is completed by the external devices, namely the user-provided IP cameras and the mobile devices, both connected via WiFi to the FLAME platform. Video sources, on the one hand, are any kind of video-transmitting device, such as IP cameras, action cams, cameras installed on board drones, etc. In the experiment, in order to be compliant with the rest of the platform, we will use WiFi IP cameras transmitting HTTP-based flows or, alternatively, any kind of camera whose flows are transcoded to this format by software running on a Raspberry Pi board attached to the camera. Mobile devices, on the other hand, are Android smartphones or tablets on which the VISION app will be installed.
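
As an example of the transcoding step mentioned above, the sketch below wraps an FFmpeg invocation that repackages an RTSP camera flow into an HTTP-deliverable HLS stream. The choice of FFmpeg, the output path and the parameters are assumptions, since the document does not name the software actually used on the Raspberry Pi.

```python
import subprocess

def rtsp_to_hls(rtsp_url: str, out_dir: str = "/var/www/hls"):
    """Repackage an RTSP camera flow into an HTTP-deliverable HLS playlist
    using FFmpeg (assumed to be installed on the Raspberry Pi board)."""
    cmd = [
        "ffmpeg",
        "-i", rtsp_url,
        "-c:v", "copy",               # copy the video track when the codec allows it
        "-an",                        # drop audio to keep the flow lightweight
        "-f", "hls",
        "-hls_time", "2",             # 2-second segments
        "-hls_list_size", "5",        # keep a short rolling playlist
        "-hls_flags", "delete_segments",
        f"{out_dir}/stream.m3u8",
    ]
    return subprocess.Popen(cmd)

# Example (fictitious camera address):
# rtsp_to_hls("rtsp://192.0.2.10:554/live")
```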