Mission

Harnessing smartphone capabilities (sensors, performance) and modern browser features (platform independence, connectivity, AI) for assistive technology.

A concept and set of libraries for affordable image processing that extract facial and body position data and provide them as textual output via an API, for broad use in assistive technologies and other applications.

Problem

Assistive technology spans a wide variety of highly specific tasks.

Developing assistive technologies is expensive because each individual with a disability has unique needs.

For example, in patients with locked-in syndrome (who cannot communicate directly), personalized movement options such as using the mouth, hands, or similar methods can be identified. Applications must also consider specific requirements, including limitations in movement range, involuntary movements (tics), and other factors.

In addition, specialized image-processing hardware can be expensive.

Solution

Let’s break the many specialized tasks down into their common components and their task-specific parts. For hardware, we will use devices that users already own or can acquire affordably.

Specifically, this means:
  • We will use unified software to capture data on facial and body positions, gestures, sounds, etc.
  • This software will run on a smartphone, thus utilizing its processor for image and sound processing, as well as its camera and/or microphone. Additional sensors like GPS, accelerometer, gyroscope, light sensors, etc., can potentially be used.
  • We will define an API (Application Programming Interface) that other developers can use to create their applications. The API will provide a stream of textual data.
  • We will present several demo applications to demonstrate the full concept of using API data.
A mobile application provides textual data to other applications.
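The flow above can be sketched end to end: a receiving application consumes the textual stream and maps recognized gestures to its own actions. All names and values here are illustrative; the real API has not yet been published.

```python
# Illustrative sketch: an assistive application consuming the textual stream.
# Each line of input is one event from the "transmitter" (the smartphone app).
def handle_events(lines, actions):
    """Map recognized gesture labels to application-specific actions."""
    triggered = []
    for line in lines:
        gesture = line.strip()
        if gesture in actions:
            triggered.append(actions[gesture])
    return triggered

# Hypothetical mapping for a user who can reliably open and close one hand.
actions = {"open palm": "select", "fist": "click"}
stream = ["open palm", "fist", "unknown", "fist"]
print(handle_events(stream, actions))   # -> ['select', 'click', 'click']
```

The key point is that the receiver works purely with short text labels; everything camera-related stays on the transmitter.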

Safety and Ethics

We never transmit video and audio data between applications. In short, the same data is transmitted whether you are clothed or not.

All data is transmitted only as generic textual information. For example, 2D coordinates of individual hand joints or a textual description of a gesture (e.g. "open palm" or "fist").
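To illustrate, one message in such a stream might look like the following. This is a hypothetical sketch; the actual field names and format are not yet fixed by the project.

```python
import json

# Hypothetical example of the kind of textual message the API could stream:
# 2D hand-joint coordinates (normalized to the camera frame) plus a gesture label.
message = json.dumps({
    "source": "hand_tracker",
    "timestamp_ms": 1700000000000,
    "gesture": "open palm",
    "joints": [
        {"name": "wrist", "x": 0.52, "y": 0.71},
        {"name": "thumb_tip", "x": 0.47, "y": 0.63},
        {"name": "index_tip", "x": 0.55, "y": 0.58},
    ],
})

# A receiving application only ever sees this text -- never video or audio.
data = json.loads(message)
print(data["gesture"])       # -> open palm
print(len(data["joints"]))   # -> 3
```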

The aim of the project is to publish the entire API (i.e. a detailed description of the data format) – ready for user applications. Anyone can verify what information is being transmitted.

If you still do not trust the application, you can test it in a safe environment and then deny it access to your camera and microphone.

Is a smartphone enough?

A smartphone is a very powerful computer that contains all the necessary components for image and sound processing. It also has sensors such as GPS, accelerometer, gyroscope, light sensors, etc.

A smartphone is affordable and versatile. Most people already own one and use it for many purposes. Therefore, there is no need to purchase additional specialized hardware.

It is reasonable to expect that both cameras and processors (with AI support) will continue to improve.

If for some application a smartphone proves insufficient, it is possible to use it to fine-tune the application and then migrate it to specialized hardware. As long as this hardware (with any camera) supports modern web technologies, everything will work without modifications.

Similarly to a smartphone, a laptop or desktop computer with a camera can also serve this purpose.

Project Status

The project is in its early development stages. We plan the following steps:

Supported Programming Languages

  • JavaScript - for web applications
  • C++ (Arduino) - for embedded devices (microcontrollers)
  • Python - for a wide range of (desktop) applications

Communication API

We intend to design an API that defines the format of transmitted data and commands for communication between the “transmitter” (smartphone) and the “receiver” (application). An abstract description format will be used to generate the corresponding libraries in the aforementioned programming languages.
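As a rough illustration of this idea, an abstract, language-neutral message definition could be stored as data and used both to validate messages and to generate bindings in each target language. The definition format and field names below are assumptions, not the project's actual design.

```python
# Hypothetical sketch: an abstract, language-neutral message description.
# Per-language libraries would be generated from definitions like this one.
GESTURE_EVENT = {
    "name": "GestureEvent",
    "fields": {
        "timestamp_ms": int,
        "gesture": str,
        "confidence": float,
    },
}

def validate(schema, message):
    """Check that a decoded message matches the abstract definition."""
    fields = schema["fields"]
    return set(message) == set(fields) and all(
        isinstance(message[k], t) for k, t in fields.items()
    )

msg = {"timestamp_ms": 1700000000000, "gesture": "fist", "confidence": 0.93}
print(validate(GESTURE_EVENT, msg))   # -> True
```

Keeping the schema as plain data is what makes it feasible to emit JavaScript, C++, and Python libraries from a single source of truth.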

Communication Channels

In the above programming languages, we also plan to implement parts of libraries to support communication between the “transmitter” and the “receiver” for the following connection methods:

  • WebRTC (PeerJS) for web applications. All modern browsers support this technology.
  • Bluetooth for embedded devices. (We will use Bluetooth Low Energy (BLE).)
  • Sockets for communication with a server (via public Wi-Fi or a Wi-Fi hotspot opened by the application).

Application Examples - Demos

To demonstrate the concept and libraries, we have prepared several demo applications.


Contacts

For more information, please contact astriot[at]marvan.cz
LinkedIn: Ivo Marvan