Visual Assistant is a real-time interface that applies machine learning and image-processing techniques to live video to understand the scene around the user. The tool is designed primarily to assist visually impaired individuals by providing auditory feedback about their environment. By analyzing visual data in real time, the application generates spoken descriptions of the surroundings, allowing users to gain situational awareness without direct visual input or assistance from others.
The application runs on mobile devices and delivers speech output describing detected objects, their relative positions, and their distances from the user. The system can also infer environmental context and suggest possible actions, making it a comprehensive tool for understanding and interacting with the surroundings. Beyond assisting visually impaired individuals, it can be integrated into other applications, such as robotics and autonomous vehicles, for efficient environmental perception and object localization.
This project aims to contribute positively to society by improving the daily lives of individuals who are visually impaired. By providing a tool that enables these individuals to better understand their surroundings, we can potentially reduce the risk of accidents and increase their autonomy.
Furthermore, this technology holds promise for broader applications in various industries, including robotics, autonomous vehicles, and smart systems, where real-time environmental recognition and situational awareness are critical.
Visual Assistant is an Android-based application that uses computer vision techniques to identify objects in real time and determine their positions relative to the user. By leveraging state-of-the-art deep learning models, the system provides speech output that describes detected objects and their distances from the user.
For example, if a user is in a room with a cup 3 feet away and a chair 5.5 feet away, the system will notify them with audio feedback: "There’s a cup 3 feet away from you, and a chair 5.5 feet away."
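As a rough illustration of how such an announcement could be assembled, the sketch below builds the sentence from a list of (label, distance) pairs and speaks it with an offline text-to-speech engine. This is a minimal sketch, not the repository's actual pipeline: the `describe_objects` helper, the hard-coded detections, and the use of `pyttsx3` are assumptions for illustration; the Android app would more likely rely on the platform's own TTS service.

```python
# Minimal sketch (assumption: not the repository's exact pipeline) of turning
# detections with estimated distances into the spoken sentence described above.
import pyttsx3  # offline text-to-speech; the Android app would likely use the platform TTS instead

def describe_objects(detections):
    """Build a sentence from (label, distance_in_feet) pairs."""
    if not detections:
        return "No objects detected around you."
    parts = [f"a {label} {distance:g} feet away from you" for label, distance in detections]
    if len(parts) == 1:
        return "There's " + parts[0] + "."
    return "There's " + ", ".join(parts[:-1]) + ", and " + parts[-1] + "."

if __name__ == "__main__":
    # Hypothetical detections matching the example in the text.
    sentence = describe_objects([("cup", 3), ("chair", 5.5)])
    print(sentence)
    engine = pyttsx3.init()
    engine.say(sentence)   # speak the description aloud
    engine.runAndWait()
```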
In addition to object localization, the system can also recognize and describe the broader environment. For example, if the user is sitting in a classroom, the system will detect the context and inform the user, "You are in a classroom."
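The scene announcement could come from an image classifier run on the full video frame. The snippet below is a hedged sketch assuming the Keras checkpoint listed later in this document is an InceptionResNetV2-style classifier fine-tuned on indoor scene categories; the label list, input size, and preprocessing are assumptions, not confirmed details of the project.

```python
# Hedged sketch of the scene announcement, assuming the Keras checkpoint listed
# below is an InceptionResNetV2-style classifier fine-tuned on indoor scenes.
# The label list, input size, and preprocessing here are assumptions.
import cv2
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.applications.inception_resnet_v2 import preprocess_input

SCENE_LABELS = ["classroom", "kitchen", "office"]  # placeholder subset of scene categories

model = load_model("backend/Classes/utills/model.11-0.6262.hdf5")

def describe_environment(frame_bgr):
    """Return a spoken-style description of the scene in a BGR video frame."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    resized = cv2.resize(rgb, (299, 299))   # InceptionResNetV2's default input size
    batch = preprocess_input(np.expand_dims(resized.astype("float32"), axis=0))
    probs = model.predict(batch)[0]
    return f"You are in a {SCENE_LABELS[int(np.argmax(probs))]}."
```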
The system can also be adapted for integration into larger systems that require real-time object detection and environmental awareness. Potential applications include autonomous vehicles, robotics, and image-heavy platforms such as Facebook and Instagram, where automatic recognition of scenes and objects is valuable.
- Model checkpoint: `backend/Classes/utills/model.11-0.6262.hdf5`
- YOLO weights: `backend/Classes/utills/yolov3.weights` (a minimal loading sketch follows this list)
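For reuse in a larger real-time system, the YOLOv3 weights above could be loaded with OpenCV's DNN module and run against a live camera feed. This is a sketch under assumptions: the `yolov3.cfg` configuration file and the `coco.names` class list are not part of the listing above and are placeholders here, and the repository's own detection code may use a different loader.

```python
# Hedged sketch: loading the YOLOv3 weights listed above with OpenCV's DNN module
# for a real-time detection loop. The yolov3.cfg file and the coco.names class
# list are placeholders, not part of the repository listing above.
import cv2

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "backend/Classes/utills/yolov3.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

with open("coco.names") as f:          # assumed class-name file matching the weights
    class_names = [line.strip() for line in f]

cap = cv2.VideoCapture(0)              # live video source (device camera)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    class_ids, confidences, boxes = model.detect(frame, confThreshold=0.5, nmsThreshold=0.4)
    for cid, conf, box in zip(class_ids, confidences, boxes):
        x, y, w, h = map(int, box)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, f"{class_names[int(cid)]} {conf:.2f}", (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("Visual Assistant (sketch)", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```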
- Joseph Redmon and Ali Farhadi, "YOLOv3: An Incremental Improvement" (link)
- Keras documentation: InceptionResNetV2 (link)
- Research paper: "Analysis of Blind Pedestrian Deaths and Injuries from Motor Vehicle Crashes" (link)
- Open Images Dataset (link)
- Arun Ponnusamy, high-level computer vision library for Python (cvlib) (link)
- MIT Indoor dataset (link)