Fig. 1 - A force feedback device called The PHANToM by SensAble Technologies Inc
Haptic robots such as the PHANToM were taken as the starting point of brainstorming when
deciding which design to use, which interface to create, and which features to add. Notably, in
experiments where users judge the hardness of an object or the workspace size, some users do not
notice meaningful differences in hardness even though they manipulate different types of haptic
interface devices [6]. The PHANToM is a popular linkage-based haptic device (fig. 1).
The position and orientation of the pen are tracked through encoders in the robotic arm. Three
degrees of force, in the x, y, and z directions, are achieved through motors that apply torques at each
joint of the robotic arm. The arm tracks the position of the pen (the end-effector); as a result, the
controller must determine the proper joint angles and torques necessary to exert a single point of force
on the tip of the pen. An alternative to a linkage-based device is one that is tension-based. Instead of
applying force through links, cables are connected to the point of contact in order to exert a force
[7]. Encoders determine the length of each cable. From this information, the position of a gripper
can be determined. Motors are used to create tension in the cables, which results in an applied force
at the grip. A pen-shaped end-effector is a convenient way to perform manipulation tasks with the
robot, depending on the environment it is used in. Our robot provides 3 DOF that we are able to
measure; the end-effector is attached through a spherical base that adds another 3 DOF, which serve
only to guide the joints through proper movement.
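The step from a desired tip force to joint torques mentioned above is conventionally done through the transpose of the manipulator Jacobian. Below is a minimal sketch of that mapping; the planar two-link geometry and link lengths are illustrative assumptions, not the actual PHANToM kinematics.

```python
import numpy as np

def jacobian_2link(theta1, theta2, l1=0.2, l2=0.15):
    """Jacobian of a planar two-link arm; the geometry and link
    lengths are illustrative, not the actual PHANToM linkage."""
    s1, c1 = np.sin(theta1), np.cos(theta1)
    s12, c12 = np.sin(theta1 + theta2), np.cos(theta1 + theta2)
    return np.array([
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ])

def joint_torques(theta, force_xy):
    """tau = J(theta)^T F: the joint torques that exert the desired
    force at the end-effector (the tip of the pen)."""
    return jacobian_2link(*theta).T @ np.asarray(force_xy)

# Example: 1 N of resistance along -x at the current pose
print(joint_torques((0.3, 0.8), (-1.0, 0.0)))
```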
Robot motion in teleoperated systems is usually controlled by operators with the help of a
camera mounted on the robot or observing the area from above. Although vision systems provide
much information about the environment, they demand considerable attention from the operator. To
overcome this problem, haptic devices give operators the additional sense of feeling the robot
workspace, making it easier to avoid obstacles and reducing the average number of collisions.
Usually, operators have to drive the mobile robot manually through obstacles by explicitly
specifying the robot's angular and linear velocity. In doing so they are fully in charge of the robot
motion, and since a clear view of the robot's environment may not always be available, they can
accidentally drive the robot into collisions or choose longer paths than the optimal ones. One
possible way to resolve this is to use a vibrating motor that gives force feedback to the operator,
notifying them without blocking their visual or auditory perception during the work.
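One common way to realize such vibration feedback is to scale the motor intensity with proximity to the nearest obstacle. The sketch below shows this idea; the distance source, thresholds, and PWM range are assumptions for illustration, not values from the paper.

```python
def vibration_duty(distance_m, d_warn=1.0, d_min=0.1):
    """Map the distance to the nearest obstacle to a PWM duty value
    in [0, 255]: silent beyond d_warn, full vibration at or inside
    d_min. Thresholds are illustrative, not from the paper."""
    if distance_m >= d_warn:
        return 0
    if distance_m <= d_min:
        return 255
    # Linear ramp between the two thresholds
    return int(255 * (d_warn - distance_m) / (d_warn - d_min))

for d in (1.5, 0.8, 0.4, 0.05):
    print(d, vibration_duty(d))
```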
The mobile robot haptic teleoperation system consists of two sides: the Master side, which contains
the haptic device and the master station with the map-building module, and the Slave side, which
contains the mobile robot and a slave robot server with the behavior and localization modules for
the exploration of the unknown environment [8]. According to A. Tatematsu and Y. Ishibashi [6],
system efficiency is higher when the workspace is uniformly mapped to the virtual space in the
directions of the x-, y-, and z-axes than when the workspace is mapped to the virtual space
individually along each axis so that the mapped workspace size corresponds to the virtual space
size.
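The two mapping strategies compared in [6] can be written down directly: uniform mapping applies one scale factor to all axes, while per-axis mapping scales each axis independently to fill the virtual space. A small sketch follows; the workspace and virtual-space dimensions are illustrative assumptions.

```python
import numpy as np

# Illustrative dimensions, not measurements from the paper
workspace = np.array([0.16, 0.12, 0.07])   # device workspace extents (m)
virtual = np.array([0.30, 0.30, 0.30])     # virtual space extents (m)

# Uniform mapping: a single scale for x, y and z (reported as more
# efficient in [6]); motion keeps its aspect ratio
s_uniform = (virtual / workspace).min()

# Per-axis mapping: each axis stretched independently so the mapped
# workspace size matches the virtual space size
s_per_axis = virtual / workspace

p = np.array([0.08, 0.06, 0.035])          # a sample device position
print(p * s_uniform)
print(p * s_per_axis)
```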
II. Interface design
Our research is devoted to developing the principles and tools necessary for the realization of
advanced robotic and human-machine systems capable of haptic interaction. Among the possible
sensor types, we focus on the ACP single-turn trimmer potentiometer CA9H5 1K with shaft (fig. 2).
A number of terms are used in the electronics industry to describe certain types of potentiometers,
the trimmer pot among them. It can be described as a trimming potentiometer that is adjusted once
or infrequently for "fine-tuning" of an electrical signal. To move the robot, the user performs a hand
movement; this movement is sensed by a potentiometer attached to each joint. The output current
should be monitored continuously to protect the drive mechanics and the motor; the switch-off limit
can be adapted via the trimmer potentiometer to suit the individual drive used.
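The over-current protection described above amounts to comparing the measured motor current against a limit set by the trimmer and cutting the drive when it is exceeded. Here is a hedged sketch of one pass of such a loop; read_current_adc, read_trimmer_adc, and motor_enable are hypothetical hardware-access helpers, and the ADC resolution and current scale are assumptions.

```python
ADC_MAX = 1023        # assuming a 10-bit ADC
I_FULL_SCALE = 2.0    # assumed amperes at full-scale ADC reading

def protect_drive(read_current_adc, read_trimmer_adc, motor_enable):
    """One pass of the protection loop: compare the motor current
    with the switch-off limit dialed in on the trimmer pot and cut
    the drive when the limit is exceeded. All three callables are
    hypothetical hardware-access helpers."""
    current = read_current_adc() / ADC_MAX * I_FULL_SCALE
    limit = read_trimmer_adc() / ADC_MAX * I_FULL_SCALE
    if current > limit:
        motor_enable(False)   # protect the drive mechanics and motor
        return False
    return True
```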
Fig. 2 - CA9H5 1K trimmer pot
Generally, a potentiometer can serve as a position sensor for the haptic feedback loop
controlling the motor. To test the methodology and to study and develop a haptic interface that can
be used to teleoperate complex anthropomorphic robotic systems in an intuitive way, a first
approach was implemented as a Simulink model (fig. 3). In this model, the Analog Input block of
the Real-Time Windows Target receives data from the potentiometers for each joint position, the
Scope block relays the data to the MATLAB Workspace, and the Math Function block converts the
potentiometer readings at the joints to radians.
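The conversion performed by the Math Function block reduces to rescaling the potentiometer wiper voltage by the pot's electrical travel. A sketch of that rescaling follows; the 5 V supply and 300° of electrical rotation are typical for single-turn trimmers but are assumptions here, not figures from the paper.

```python
import math

V_SUPPLY = 5.0                    # assumed excitation voltage
TRAVEL = math.radians(300.0)      # assumed electrical travel of the pot

def pot_to_radians(v_wiper):
    """Rescale the wiper voltage of a joint potentiometer to a
    joint angle in radians, as the Math Function block does."""
    return v_wiper / V_SUPPLY * TRAVEL

print(pot_to_radians(2.5))        # mid-travel -> about 2.62 rad
```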
Fig. 3 - First Simulink model and MATLAB-potentiometer circuit
III. Experimental results and future work
After assembling the interface hardware and establishing the connection with the software,
we ran tests to observe the behavior of the system. Despite several errors that were subsequently
corrected, the haptic interface simulation proved able to give good results in terms of overall
performance, including the response of the hardware to the software and vice versa.
Fig. 4 - Model of the Desktop Haptic Interface
The Desktop Haptic Interface (fig. 4) was able to read the signals sent from the potentiometers
and respond within a short amount of time. A virtual model of the STAUBLI robot and a box
representing an obstacle can then be added. Both the system and the models make correct
movements while being operated by the user. To test whether the output of the haptic system can
even be used as input for the actual robot, the data should be recorded in a separate file and sent to
the input system of the STAUBLI. As a result, the robot responds accordingly, with the vibrations
and warnings related to the obstacle being taken into account.
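The record-and-replay test described above can be reproduced with a few lines: joint angles sampled from the interface are written to a file, which the robot-side program then reads as its input. This is only a sketch; the CSV format, the sampling rate, and sample_fn are assumptions, not the STAUBLI interface itself.

```python
import csv
import time

def record_joints(sample_fn, path="joints.csv", n_samples=500, dt=0.02):
    """Log timestamped joint angles from the haptic interface to a
    file; sample_fn is a hypothetical callable returning (j1, j2, j3)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "j1", "j2", "j3"])
        t0 = time.time()
        for _ in range(n_samples):
            writer.writerow([round(time.time() - t0, 3), *sample_fn()])
            time.sleep(dt)

def replay_joints(path="joints.csv"):
    """Read the recording back, e.g. to feed the robot-side input."""
    with open(path) as f:
        return list(csv.DictReader(f))
```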
IV. Conclusion
In conclusion, this paper described a haptic teleoperation interface with possible further
implementations on robots similar to the Staubli. Future work would include developing the other
three DOF and increasing the size of our interface to fulfill the initial idea of a wearable design.
Eliminating the DC power supply would also make the system autonomous. Alternative designs,
materials, and components might be used. Additional sensors and vibrating motors serving as force
feedback elements might be placed on different parts of the human arm in order to collect
alternative data from each point and get a fuller picture of how such a system would perform and
what outputs it would produce.
A number of other improvements could enhance the system and open the door to many other
applications. Such systems are already being deployed in fields such as robotic surgery, where
high-precision movement is the first necessity. In industry, safety signals coming from the vibrating
motors would engage the operator's tactile sensations in addition to the auditory and visual ones. To
prevent unwanted collisions, a haptic interface like ours would provide the user with involuntary
muscle contractions and guide the operator to tilt the joint in other directions. The video game
industry might use such a system to enhance the gamer's sensation of reality.
Our project is a small contribution to the body of research on haptic teleoperation interfaces,
which will lead to more interesting implementations, both existing ones and those that, hopefully,
will be created in the near future.
References:
1. C. B. Zilles and J. K. Salisbury, "A constraint-based god-object method for haptic display," in
Proceedings of the 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems:
Human Robot Interaction and Cooperative Robots, vol. 3. IEEE, 1995, pp. 146–151.
2. B. Horan, D. Creighton, S. Nahavandi, and M. Jamshidi, "Bilateral haptic teleoperation of an
articulated track mobile robot," in IEEE International Conference on System of Systems Engineering
(SoSE'07). IEEE, 2007, pp. 1–8.
3. J. C. Perry and J. Rosen, "Design of a 7 degree-of-freedom upper-limb powered exoskeleton," in
The First IEEE/RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics
(BioRob 2006). IEEE, 2006, pp. 805–810.
4. T. Hayashi, H. Kawamoto, and Y. Sankai, "Control method of robot suit HAL working as
operator's muscle using biological and dynamical information," in 2005 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS 2005). IEEE, 2005, pp. 3063–3068.
5. A. Schiele and F. C. van der Helm, "Kinematic design to improve ergonomics in human machine
interaction," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 14, no. 4,
pp. 456–469, 2006.
6. A. Tatematsu and Y. Ishibashi, Mapping Workspaces to Virtual Space in Work Using
Heterogeneous Haptic Interface Devices. INTECH Open Access Publisher, 2010.
7. J. J. Berkley, "Haptic devices," White Paper, Mimic Technologies Inc., Seattle, 2003.
8. N. C. Mitsou, S. V. Velanas, and C. S. Tzafestas, "Visuo-haptic interface for teleoperation of
mobile robot exploration tasks," in The 15th IEEE International Symposium on Robot and Human
Interactive Communication (RO-MAN 2006). IEEE, 2006, pp. 157–163.
UDC 004
SHAYAKHMETOVA A.B., POLICHSHUK Y.V.
EMOTION RECOGNITION BASED ON IMAGE PROCESSING
(Kazakh-British Technical University, Almaty, Kazakhstan)
Abstract
Facial expressions are an extremely important part of human life, and basic research on
emotions over the previous couple of decades has delivered several findings that have led to
important real-world applications. People can adopt a facial expression voluntarily or involuntarily,
and the neural mechanisms responsible for controlling the expression differ in each case. Reliable
expression recognition by machine, however, is still a challenge. This paper presents an application
of the machine learning technique of support vector machines (SVM) to the recognition and
classification of human emotions based on image processing.
Keywords: Expression recognition, Face detection, SVM
Introduction
Exact recognition and classification of facial expressions turns out to be an extremely
difficult task. Despite immense efforts in computer hardware and software development, including
the development of sophisticated machine learning algorithms, no computational system yet exists
that approximates the performance of humans. Traditionally, facial expressions have been studied
by clinical and social psychologists, medical practitioners, actors, and artists. In the last quarter of
the 20th century, however, with advances in artificial intelligence, computer vision, computer
graphics, and 3D modeling, computer scientists started showing interest in the study of facial
expressions.
Face detection and feature extraction
Automatic emotion recognition systems are divided into three modules:
1. Face Detection
2. Feature Extraction
3. Expression Classification
First of all, I detect the face in the image loaded into a matrix using the Viola-Jones object
detection framework. The Viola-Jones framework, created by Paul Viola and Michael Jones in
2001, was the first to provide competitive object detection rates, and it can be trained to detect a
variety of object classes. The algorithm is implemented in the OpenCV library, which is also
included in my project. A cascade of boosted classifiers working with Haar-like features comes
pretrained in OpenCV on a few hundred sample views of a particular object and arbitrary pictures
of the same size. After loading the different pretrained face detection cascades, they can be applied
to a region of interest in an input image. To search for the object in the entire picture, one moves
the search window over the image and checks each location using the classifier. Hence, to find an
object of unknown size in the image, the scan procedure has to be done several times at various
scales.
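In OpenCV this multi-scale window scan is a single call: detectMultiScale slides and rescales the detector internally. A minimal sketch follows; the cascade file ships with the opencv-python package, and the image path is a placeholder.

```python
import cv2

# Pretrained frontal-face cascade shipped with the opencv-python package
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("subject.png")                # placeholder image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detectMultiScale performs the window scan over scales internally;
# scaleFactor controls the scale step, minNeighbors the grouping of
# overlapping detections
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```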
Figure 1 - The current algorithm uses the following Haar-like features.
Figure 2 - Example of face detection using Haar-like features.
The second step is extracting features such as the mouth, eyes, and nose. The principle of
operation remains the same; only the Haar cascades used change according to the needed region of
interest. Images of the detected eye pairs and mouth of each test picture are also saved for later use
in training the classifier.
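Feature extraction then reruns the same call with different cascades, restricted to the detected face region, and saves the sub-images for training. A sketch continuing the face-detection snippet above; the eye and smile cascades ship with standard OpenCV builds (the smile cascade is often used for the mouth region), but file names may differ in other builds.

```python
# Continues the face-detection sketch above (cv2, gray, faces in scope)
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")
mouth_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_smile.xml")

for i, (x, y, w, h) in enumerate(faces):
    face_roi = gray[y:y + h, x:x + w]   # search only inside the face
    for j, (ex, ey, ew, eh) in enumerate(
            eye_cascade.detectMultiScale(face_roi, 1.1, 10)):
        cv2.imwrite(f"eye_{i}_{j}.png", face_roi[ey:ey + eh, ex:ex + ew])
    for j, (mx, my, mw, mh) in enumerate(
            mouth_cascade.detectMultiScale(face_roi, 1.5, 20)):
        cv2.imwrite(f"mouth_{i}_{j}.png", face_roi[my:my + mh, mx:mx + mw])
```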
Support Vector Machine
Support Vector Machines are a maximal-margin hyperplane classification method that relies
on results from statistical learning theory to ensure high generalization performance. Kernel
functions are used to efficiently map input data, which may not be linearly separable, to a high-
dimensional feature space where linear methods can then be applied. SVMs display great classification
accuracy even when only a modest quantity of training data is available, making them especially
suitable for a dynamic, interactive approach to expression recognition.
Figure 3 - Recognition of the face, eyes, mouth, and nose in my project, compared to a non-face picture.
The often subtle contrasts distinguishing separate expressions, for example «anger» or
«disgust», in our displacement-based data, as well as the wide range of possible variations in a
particular expression when performed by different subjects, drove us to adopt SVMs as the
classifier of choice. Selection of an appropriate kernel function allowed further modification and
improvement of the SVM classifier for our specific domain of facial expression recognition.
Support vector machines have already been used effectively in a variety of classification
applications, including character and text recognition as well as DNA microarray data analysis.
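For concreteness, here is a hedged sketch of the classification stage with scikit-learn, a stand-in for whichever SVM implementation the project used; the random feature vectors merely stand in for real extracted features, and the kernel and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Random data standing in for real extracted feature vectors:
# one row per face image, one label per image (7 expression classes)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))
y = rng.integers(0, 7, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0)

# An RBF kernel maps the features into a space where a maximal-margin
# hyperplane can separate the classes
clf = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```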
Used Database, Evaluation and Conclusion
All images used for training my classifier were taken from the Cohn-Kanade AU-Coded
Expression Database. The Cohn-Kanade Database is intended for research in automatic facial image
analysis and synthesis and for perceptual studies. Scientists commonly define 7 universal facial
expressions: Happiness, Sadness, Surprise, Disgust, Fear, Anger, and Neutral. My initial
implementation correctly recognized expressions in 70% of trials, and subsequent changes,
including selection of a kernel function tailored to the training data, boosted recognition accuracy
up to 85%. Incorporating further possible upgrades, such as expanding the amount of training data
or performing automatic SVM model selection, is likely to yield even better performance and
further strengthen the suitability of SVM-based expression recognition approaches for building
emotionally and socially intelligent human-computer interfaces.
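The automatic SVM model selection mentioned above usually means a cross-validated search over the kernel and its hyperparameters. A sketch of one common way to do it; the grid values are illustrative assumptions.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Grid values are illustrative; X_tr, y_tr as in the previous sketch
param_grid = {
    "kernel": ["linear", "rbf", "poly"],
    "C": [0.1, 1, 10, 100],
    "gamma": ["scale", 0.01, 0.001],
}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_tr, y_tr)
print(search.best_params_, search.best_score_)
```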
Future work
The application of emotion recognition in different spheres of life is broad and vital. First of
all, such technology will help in building human-like robots and in human-computer interaction
itself. It can also find use in the social sphere, for example in monitoring the mood of visitors to
various institutions, cafes, restaurants, and anywhere else where a company's profit depends on
customers' responses; collecting such information can improve customer service. Another use of
this technology is in the development of applications or systems that can lift a person's spirits by
playing appropriate music or films. Moreover, recognition of microexpressions from live streaming
video can improve the accuracy of lie detectors and can find its application in the law enforcement
system.
УДК 004.8
YESSENBAYEV ZH., KARABALAYEVA M.
A BASELINE SYSTEM FOR KAZAKH BROADCAST NEWS TRANSCRIPTION
(National Laboratory Astana, Nazarbayev University, Astana, Kazakhstan)
Abstract
In this paper we describe our attempt to build a baseline system for Kazakh broadcast news
transcription. A neural network based acoustic model for the system was trained on the Kaldi
platform using the previously available KazSpeechDB speech corpus and the KazMedia speech
corpus, a collection of broadcast news from three different TV channels created specifically for this
task. A language model was trained with the IRSTLM toolkit using mass media news available
online. The best word error rate, 4.06%, was obtained for the Khabar channel.
Introduction
Nowadays, many media agencies produce and publish on the Internet a large amount of audio
and video material, which often lacks an accompanying text description of the contents. One of the
main problems caused by the lack of texts or transcriptions for audio and video materials is the need
to engage linguists or operators to recover the text, which accordingly entails additional costs and
time for the organization. The lack of text content for audio and video materials consequently
degrades the quality of news search results. This leads to a weaker online presence and lower
ratings of media agencies compared to foreign media, and therefore to the dissatisfaction of Internet
users. Furthermore, the absence of transcriptions keeps hearing-impaired people from accessing
such content.
Our research aims to address the problem of transcribing news in the Kazakh language using
modern speech recognition technologies. Although the problem of broadcast news transcription is
well studied for foreign languages such as Arabic [1], English [2, 4], Chinese [3] and others, it is a
very challenging problem due to the high variability of acoustic events in the data. A common news
track may contain acoustic segments with different speakers (male/female), languages (Kazakh,
Russian, English, etc.), channels (broadband/telephone), acoustic environments (studio/outdoor)
and noises (music, cars, etc.), which dramatically affect the accuracy of speech recognition systems.
Another challenge with respect to Kazakh is that many speakers are bilingual and mix Russian and
Kazakh in conversation.
In this work we present a baseline system for Kazakh broadcast news transcription. Here we
do not deal with standard tasks such as segmentation and clustering of speech, language and
speaker identification, and others; rather, we show our preliminary results on building and testing
this system on real data.
The following sections describe the speech corpus used for acoustic modelling, the experiment
setup, and the results of broadcast news transcription.
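Since word error rate (WER) is the figure quoted in the abstract, it may help to spell out its standard definition: WER = (S + D + I) / N, i.e. substitutions, deletions, and insertions from a word-level edit-distance alignment, divided by the number of reference words. A minimal sketch; the example sentences are made up for illustration.

```python
def wer(reference, hypothesis):
    """Word error rate via word-level Levenshtein distance:
    (substitutions + deletions + insertions) / reference length."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)

# One missed word out of four reference words -> WER = 0.25
print(wer("қайырлы таң құрметті көрермен", "қайырлы таң көрермен"))
```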