Fig. 1 - A force feedback device called The PHANToM by SensAble Technologies Inc
Haptic robots such as the PHANToM were taken as the starting point of brainstorming when
deciding which design to use, which interface to create, and which features to add. Notably, in
experiments where users judge the hardness of an object or the workspace size, some users do not
notice meaningful differences in hardness even though they manipulate different types of haptic
interface devices [6]. The PHANToM is a popular linkage-based haptic device (fig. 1).
The position and orientation of the pen are tracked through encoders in the robotic arm. Three
degrees of force, in the x, y, and z directions, are achieved through motors that apply torques at each
joint of the robotic arm. The arm tracks the position of the pen (the end-effector); as a result, the
controller must determine the proper joint angles and torques necessary to exert a single point of force
on the tip of the pen. An alternative to a linkage-based device is one that is tension-based. Instead of
applying force through links, cables are connected to the point of contact in order to exert a force
[7]. Encoders determine the length of each cable. From this information, the position of a gripper
can be determined. Motors are used to create tension in the cables, which results in an applied force
at the grip. A pen-shaped end-effector is a convenient way to perform manipulation tasks with the
robot, depending on the environment it is used in. Our robot provides 3 DOF that we are able to
measure; the end-effector is attached through a spherical base that adds another 3 DOF, which serve
only to guide the joints through proper movement.
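The step from a desired tip force to joint torques mentioned above is conventionally done through the transpose of the manipulator Jacobian. Below is a minimal sketch of that mapping; the planar two-link geometry and link lengths are illustrative assumptions, not the actual PHANToM kinematics.

```python
import numpy as np

def jacobian_2link(theta1, theta2, l1=0.2, l2=0.15):
    """Jacobian of a planar two-link arm; the geometry and link
    lengths are illustrative, not the actual PHANToM linkage."""
    s1, c1 = np.sin(theta1), np.cos(theta1)
    s12, c12 = np.sin(theta1 + theta2), np.cos(theta1 + theta2)
    return np.array([
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ])

def joint_torques(theta, force_xy):
    """tau = J(theta)^T F: the joint torques that exert the desired
    force at the end-effector (the tip of the pen)."""
    return jacobian_2link(*theta).T @ np.asarray(force_xy)

# Example: 1 N of resistance along -x at the current pose
print(joint_torques((0.3, 0.8), (-1.0, 0.0)))
```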
Robot motion in teleoperated systems is usually controlled by operators with the help of a
camera mounted on the robot or observing the area from above. Although vision systems provide
much information about the environment, they demand considerable attention from the operator. To
overcome this problem, haptic devices give operators the additional sense of feeling the robot
workspace, making it easier to avoid obstacles and reducing the average number of collisions.
Usually, operators have to drive the mobile robot manually through obstacles by explicitly
specifying the robot's angular and linear velocity. In doing so they are fully in charge of the robot
motion, and since a clear view of the robot's environment may not always be available, they can
accidentally drive the robot into collisions or choose longer paths than the optimal ones. One
possible way to resolve this is to use a vibrating motor that gives force feedback to the operator,
notifying them without blocking their visual or auditory perception during the work.
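One common way to realize such vibration feedback is to scale the motor intensity with proximity to the nearest obstacle. The sketch below shows this idea; the distance source, thresholds, and PWM range are assumptions for illustration, not values from the paper.

```python
def vibration_duty(distance_m, d_warn=1.0, d_min=0.1):
    """Map the distance to the nearest obstacle to a PWM duty value
    in [0, 255]: silent beyond d_warn, full vibration at or inside
    d_min. Thresholds are illustrative, not from the paper."""
    if distance_m >= d_warn:
        return 0
    if distance_m <= d_min:
        return 255
    # Linear ramp between the two thresholds
    return int(255 * (d_warn - distance_m) / (d_warn - d_min))

for d in (1.5, 0.8, 0.4, 0.05):
    print(d, vibration_duty(d))
```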
The mobile robot haptic teleoperation system consists of two sides: the Master side, which contains
the haptic device and the master station with the map-building module, and the Slave side, which
contains the mobile robot and a slave robot server with the behavior and localization modules for
the exploration of the unknown environment [8]. According to A. Tatematsu and Y. Ishibashi [6],
system efficiency is higher when the workspace is uniformly mapped to the virtual space in the
directions of the x-, y-, and z-axes than when the workspace is mapped to the virtual space
individually along each axis so that the mapped workspace size corresponds to the virtual space
size.
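The two mapping strategies compared in [6] can be written down directly: uniform mapping applies one scale factor to all axes, while per-axis mapping scales each axis independently to fill the virtual space. A small sketch follows; the workspace and virtual-space dimensions are illustrative assumptions.

```python
import numpy as np

# Illustrative dimensions, not measurements from the paper
workspace = np.array([0.16, 0.12, 0.07])   # device workspace extents (m)
virtual = np.array([0.30, 0.30, 0.30])     # virtual space extents (m)

# Uniform mapping: a single scale for x, y and z (reported as more
# efficient in [6]); motion keeps its aspect ratio
s_uniform = (virtual / workspace).min()

# Per-axis mapping: each axis stretched independently so the mapped
# workspace size matches the virtual space size
s_per_axis = virtual / workspace

p = np.array([0.08, 0.06, 0.035])          # a sample device position
print(p * s_uniform)
print(p * s_per_axis)
```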
II. Interface design
Our research is devoted to developing the principles and tools necessary for the realization of
advanced robotic and human-machine systems capable of haptic interaction. Among the possible
sensor types, we focus on the ACP single-turn trimmer potentiometer CA9H5 1K with shaft (fig. 2).
A number of terms are used in the electronics industry to describe certain types of potentiometers,
the trimmer pot among them. It can be described as a trimming potentiometer that is adjusted once
or infrequently for "fine-tuning" of an electrical signal. To move the robot, the user performs a hand
movement; this movement is sensed by a potentiometer attached to each joint. The output current
should be monitored continuously to protect the drive mechanics and the motor; the switch-off limit
can be adapted via the trimmer potentiometer to suit the individual drive used.
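The over-current protection described above amounts to comparing the measured motor current against a limit set by the trimmer and cutting the drive when it is exceeded. Here is a hedged sketch of one pass of such a loop; read_current_adc, read_trimmer_adc, and motor_enable are hypothetical hardware-access helpers, and the ADC resolution and current scale are assumptions.

```python
ADC_MAX = 1023        # assuming a 10-bit ADC
I_FULL_SCALE = 2.0    # assumed amperes at full-scale ADC reading

def protect_drive(read_current_adc, read_trimmer_adc, motor_enable):
    """One pass of the protection loop: compare the motor current
    with the switch-off limit dialed in on the trimmer pot and cut
    the drive when the limit is exceeded. All three callables are
    hypothetical hardware-access helpers."""
    current = read_current_adc() / ADC_MAX * I_FULL_SCALE
    limit = read_trimmer_adc() / ADC_MAX * I_FULL_SCALE
    if current > limit:
        motor_enable(False)   # protect the drive mechanics and motor
        return False
    return True
```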
Fig. 2 - CA9H5 1K trimmer pot
Generally, a potentiometer can serve as a position sensor for the haptic feedback loop
controlling the motor. To test the methodology and to study and develop a haptic interface that can
be used to teleoperate complex anthropomorphic robotic systems in an intuitive way, a first
approach was implemented as a Simulink model (fig. 3). In this model, the Analog Input block of
the Real-Time Windows Target receives data from the potentiometers for each joint position, the
Scope block relays the data to the MATLAB Workspace, and the Math Function block converts the
potentiometer readings at the joints to radians.
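The conversion performed by the Math Function block reduces to rescaling the potentiometer wiper voltage by the pot's electrical travel. A sketch of that rescaling follows; the 5 V supply and 300° of electrical rotation are typical for single-turn trimmers but are assumptions here, not figures from the paper.

```python
import math

V_SUPPLY = 5.0                    # assumed excitation voltage
TRAVEL = math.radians(300.0)      # assumed electrical travel of the pot

def pot_to_radians(v_wiper):
    """Rescale the wiper voltage of a joint potentiometer to a
    joint angle in radians, as the Math Function block does."""
    return v_wiper / V_SUPPLY * TRAVEL

print(pot_to_radians(2.5))        # mid-travel -> about 2.62 rad
```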
Fig. 3 - First Simulink model and MATLAB-potentiometer circuit
III. Experimental results and future work
After assembling the interface hardware and establishing the connection with the software,
we ran tests to observe the behavior of the system. Despite several errors that were subsequently
corrected, the haptic interface simulation proved able to give good results in terms of overall
performance, including the response of the hardware to the software and vice versa.
Fig. 4 - Model of the Desktop Haptic Interface
The Desktop Haptic Interface (fig. 4) was able to read the signals sent from the potentiometers
and respond within a short amount of time. A virtual model of the STAUBLI robot and a box
representing an obstacle can then be added. Both the system and the models make correct
movements while being operated by the user. To test whether the output of the haptic system can
even be used as input for the actual robot, the data should be recorded in a separate file and sent to
the input system of the STAUBLI. As a result, the robot responds accordingly, with the vibrations
and warnings related to the obstacle being taken into account.
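The record-and-replay test described above can be reproduced with a few lines: joint angles sampled from the interface are written to a file, which the robot-side program then reads as its input. This is only a sketch; the CSV format, the sampling rate, and sample_fn are assumptions, not the STAUBLI interface itself.

```python
import csv
import time

def record_joints(sample_fn, path="joints.csv", n_samples=500, dt=0.02):
    """Log timestamped joint angles from the haptic interface to a
    file; sample_fn is a hypothetical callable returning (j1, j2, j3)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "j1", "j2", "j3"])
        t0 = time.time()
        for _ in range(n_samples):
            writer.writerow([round(time.time() - t0, 3), *sample_fn()])
            time.sleep(dt)

def replay_joints(path="joints.csv"):
    """Read the recording back, e.g. to feed the robot-side input."""
    with open(path) as f:
        return list(csv.DictReader(f))
```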
IV. Conclusion
In conclusion, this paper described a haptic teleoperation interface with possible further
implementations on robots similar to the Staubli. Future work would include developing the other
three DOF and increasing the size of our interface to fulfill the initial idea of a wearable design.
Eliminating the DC power supply would also make the system autonomous. Alternative designs,
materials, and components might be used. Additional sensors and vibrating motors serving as force
feedback elements might be placed on different parts of the human arm in order to collect
alternative data from each point and get a fuller picture of how such a system would perform and
what outputs it would produce.
A number of other improvements could enhance the system and open the door to many other
applications. Such systems are already being deployed in fields such as robotic surgery, where
high-precision movement is the first necessity. In industry, safety signals coming from the vibrating
motors would engage the operator's tactile sensations in addition to the auditory and visual ones. To
prevent unwanted collisions, a haptic interface like ours would provide the user with involuntary
muscle contractions and guide the operator to tilt the joint in other directions. The video game
industry might use such a system to enhance the gamer's sensation of reality.
Our project is a small contribution to the body of research on haptic teleoperation interfaces,
which will lead to more interesting implementations, both existing ones and those that, hopefully,
will be created in the near future.
References:
1. C. B. Zilles and J. K. Salisbury, "A constraint-based god-object method for haptic display," in
Proceedings of the 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems:
Human Robot Interaction and Cooperative Robots, vol. 3. IEEE, 1995, pp. 146–151.
2. B. Horan, D. Creighton, S. Nahavandi, and M. Jamshidi, "Bilateral haptic teleoperation of an
articulated track mobile robot," in IEEE International Conference on System of Systems Engineering
(SoSE'07). IEEE, 2007, pp. 1–8.
3. J. C. Perry and J. Rosen, "Design of a 7 degree-of-freedom upper-limb powered exoskeleton," in
The First IEEE/RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics
(BioRob 2006). IEEE, 2006, pp. 805–810.
4. T. Hayashi, H. Kawamoto, and Y. Sankai, "Control method of robot suit HAL working as
operator's muscle using biological and dynamical information," in 2005 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS 2005). IEEE, 2005, pp. 3063–3068.
5. A. Schiele and F. C. van der Helm, "Kinematic design to improve ergonomics in human machine
interaction," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 14, no. 4,
pp. 456–469, 2006.
6. A. Tatematsu and Y. Ishibashi, Mapping Workspaces to Virtual Space in Work Using
Heterogeneous Haptic Interface Devices. INTECH Open Access Publisher, 2010.
7. J. J. Berkley, "Haptic devices," White Paper, Mimic Technologies Inc., Seattle, 2003.
8. N. C. Mitsou, S. V. Velanas, and C. S. Tzafestas, "Visuo-haptic interface for teleoperation of
mobile robot exploration tasks," in The 15th IEEE International Symposium on Robot and Human
Interactive Communication (RO-MAN 2006). IEEE, 2006, pp. 157–163.
UDC 004
SHAYAKHMETOVA A.B., POLICHSHUK Y.V.
EMOTION RECOGNITION BASED ON IMAGE PROCESSING
(Kazakh-British Technical University, Almaty, Kazakhstan)
Abstract
Facial expressions are an extremely important part of human life, and basic research on
emotions over the previous couple of decades has delivered several findings that have led to
important real-world applications. People can adopt a facial expression voluntarily or involuntarily,
and the neural mechanisms responsible for controlling the expression differ in each case. Reliable
expression recognition by machine, however, is still a challenge. This paper presents an application
of the machine learning technique of support vector machines (SVM) to the recognition and
classification of human emotions based on image processing.
Keywords: Expression recognition, Face detection, SVM
Introduction
Exact recognition and classification of facial expressions turns out to be an extremely
difficult task. Despite immense efforts in computer hardware and software development, including
the development of sophisticated machine learning algorithms, no computational system yet exists
that approximates the performance of humans. Traditionally, facial expressions have been studied
by clinical and social psychologists, medical practitioners, actors, and artists. In the last quarter of
the 20th century, however, with advances in artificial intelligence, computer vision, computer
graphics, and 3D modeling, computer scientists started showing interest in the study of facial
expressions.
Face detection and feature extraction
Automatic emotion recognition systems are divided into three modules:
1. Face Detection
2. Feature Extraction
3. Expression Classification
First of all, I detect the face in the image loaded into a matrix using the Viola-Jones object
detection framework. The Viola-Jones framework, created by Paul Viola and Michael Jones in
2001, was the first to provide competitive object detection rates, and it can be trained to detect a
variety of object classes. The algorithm is implemented in the OpenCV library, which is also
included in my project. A cascade of boosted classifiers working with Haar-like features comes
pretrained in OpenCV on a few hundred sample views of a particular object and arbitrary pictures
of the same size. After loading the different pretrained face detection cascades, they can be applied
to a region of interest in an input image. To search for the object in the entire picture, one moves
the search window over the image and checks each location using the classifier. Hence, to find an
object of unknown size in the image, the scan procedure has to be done several times at various
scales.
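In OpenCV this multi-scale window scan is a single call: detectMultiScale slides and rescales the detector internally. A minimal sketch follows; the cascade file ships with the opencv-python package, and the image path is a placeholder.

```python
import cv2

# Pretrained frontal-face cascade shipped with the opencv-python package
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("subject.png")                # placeholder image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detectMultiScale performs the window scan over scales internally;
# scaleFactor controls the scale step, minNeighbors the grouping of
# overlapping detections
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```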
Figure 1 - The current algorithm uses the following Haar-like features.
Figure 2 - Example of face detection using Haar-like features.
The second step is extracting features such as the mouth, eyes, and nose. The principle of
operation remains the same; only the Haar cascades used change according to the needed region of
interest. Images of the detected eye pairs and mouth of each test picture are also saved for later use
in training the classifier.
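Feature extraction then reruns the same call with different cascades, restricted to the detected face region, and saves the sub-images for training. A sketch continuing the face-detection snippet above; the eye and smile cascades ship with standard OpenCV builds (the smile cascade is often used for the mouth region), but file names may differ in other builds.

```python
# Continues the face-detection sketch above (cv2, gray, faces in scope)
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")
mouth_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_smile.xml")

for i, (x, y, w, h) in enumerate(faces):
    face_roi = gray[y:y + h, x:x + w]   # search only inside the face
    for j, (ex, ey, ew, eh) in enumerate(
            eye_cascade.detectMultiScale(face_roi, 1.1, 10)):
        cv2.imwrite(f"eye_{i}_{j}.png", face_roi[ey:ey + eh, ex:ex + ew])
    for j, (mx, my, mw, mh) in enumerate(
            mouth_cascade.detectMultiScale(face_roi, 1.5, 20)):
        cv2.imwrite(f"mouth_{i}_{j}.png", face_roi[my:my + mh, mx:mx + mw])
```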
Support Vector Machine
Support Vector Machines are a maximal-margin hyperplane classification method that relies
on results from statistical learning theory to ensure high generalization performance. Kernel
functions are used to efficiently map input data, which may not be linearly separable, to a high-
dimensional feature space where linear methods can then be applied. SVMs display great classification
accuracy even when only a modest quantity of training data is available, making them especially
suitable for a dynamic, interactive approach to expression recognition.
Figure 3 - Recognition of the face, eyes, mouth, and nose in my project, compared to a non-face picture.
The often subtle contrasts distinguishing separate expressions, for example «anger» or
«disgust», in our displacement-based data, as well as the wide range of possible variations in a
particular expression when performed by different subjects, drove us to adopt SVMs as the
classifier of choice. Selection of an appropriate kernel function allowed further modification and
improvement of the SVM classifier for our specific domain of facial expression recognition.
Support vector machines have already been used effectively in a variety of classification
applications, including character and text recognition as well as DNA microarray data analysis.
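For concreteness, here is a hedged sketch of the classification stage with scikit-learn, a stand-in for whichever SVM implementation the project used; the random feature vectors merely stand in for real extracted features, and the kernel and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Random data standing in for real extracted feature vectors:
# one row per face image, one label per image (7 expression classes)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))
y = rng.integers(0, 7, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0)

# An RBF kernel maps the features into a space where a maximal-margin
# hyperplane can separate the classes
clf = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```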
Used Database, Evaluation and Conclusion
All images used for training my classifier were taken from the Cohn-Kanade AU-Coded
Expression Database. The Cohn-Kanade Database is intended for research in automatic facial image
analysis and synthesis and for perceptual studies. Scientists commonly define 7 universal facial
expressions: Happiness, Sadness, Surprise, Disgust, Fear, Anger, and Neutral. My initial
implementation correctly recognized expressions in 70% of trials, and subsequent changes,
including selection of a kernel function tailored to the training data, boosted recognition accuracy
up to 85%. Incorporating further possible upgrades, such as expanding the amount of training data
or performing automatic SVM model selection, is likely to yield even better performance and
further strengthen the suitability of SVM-based expression recognition approaches for building
emotionally and socially intelligent human-computer interfaces.
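The automatic SVM model selection mentioned above usually means a cross-validated search over the kernel and its hyperparameters. A sketch of one common way to do it; the grid values are illustrative assumptions.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Grid values are illustrative; X_tr, y_tr as in the previous sketch
param_grid = {
    "kernel": ["linear", "rbf", "poly"],
    "C": [0.1, 1, 10, 100],
    "gamma": ["scale", 0.01, 0.001],
}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_tr, y_tr)
print(search.best_params_, search.best_score_)
```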
Future work
The application of emotion recognition in different spheres of life is broad and vital. First of
all, such technology will help in building human-like robots and in human-computer interaction
itself. It can also find use in the social sphere, for example in monitoring the mood of visitors to
various institutions, cafes, restaurants, and anywhere else where a company's profit depends on
customers' responses; collecting such information can improve customer service. Another use of
this technology is in the development of applications or systems that can lift a person's spirits by
playing appropriate music or films. Moreover, recognition of microexpressions from live streaming
video can improve the accuracy of lie detectors and can find its application in the law enforcement
system.
УДК 004.8
YESSENBAYEV ZH., KARABALAYEVA M.
A BASELINE SYSTEM FOR KAZAKH BROADCAST NEWS TRANSCRIPTION
(National Laboratory Astana, Nazarbayev University, Astana, Kazakhstan)
Abstract
In this paper we describe our attempt to build a baseline system for Kazakh broadcast news
transcription. A neural network based acoustic model for the system was trained on the Kaldi
platform using the previously available KazSpeechDB speech corpus and the KazMedia speech
corpus, a collection of broadcast news from three different TV channels created specifically for this
task. A language model was trained with the IRSTLM toolkit using mass media news available
online. The best word error rate, 4.06%, was obtained for the Khabar channel.
Introduction
Nowadays, many media agencies produce and publish on the Internet a large amount of audio
and video material, which often lacks an accompanying text description of the contents. One of the
main problems caused by the lack of texts or transcriptions for audio and video materials is the need
to engage linguists or operators to recover the text, which accordingly entails additional costs and
time for the organization. The lack of text content for audio and video materials consequently
degrades the quality of news search results. This leads to a weaker online presence and lower
ratings of media agencies compared to foreign media, and therefore to the dissatisfaction of Internet
users. Furthermore, the absence of transcriptions keeps hearing-impaired people from accessing
such content.
Our research aims to address the problem of transcribing news in the Kazakh language using
modern speech recognition technologies. Although the problem of broadcast news transcription is
well studied for foreign languages such as Arabic [1], English [2, 4], Chinese [3] and others, it is a
very challenging problem due to the high variability of acoustic events in the data. A common news
track may contain acoustic segments with different speakers (male/female), languages (Kazakh,
Russian, English, etc.), channels (broadband/telephone), acoustic environments (studio/outdoor)
and noises (music, cars, etc.), which dramatically affect the accuracy of speech recognition systems.
Another challenge with respect to Kazakh is that many speakers are bilingual and mix Russian and
Kazakh in conversation.
In this work we present a baseline system for Kazakh broadcast news transcription. Here we
do not deal with standard tasks such as segmentation and clustering of speech, language and
speaker identification, and others; rather, we show our preliminary results on building and testing
this system on real data.
The following sections describe the speech corpus used for acoustic modelling, the experiment
setup, and the results of broadcast news transcription.
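Since word error rate (WER) is the figure quoted in the abstract, it may help to spell out its standard definition: WER = (S + D + I) / N, i.e. substitutions, deletions, and insertions from a word-level edit-distance alignment, divided by the number of reference words. A minimal sketch; the example sentences are made up for illustration.

```python
def wer(reference, hypothesis):
    """Word error rate via word-level Levenshtein distance:
    (substitutions + deletions + insertions) / reference length."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)

# One missed word out of four reference words -> WER = 0.25
print(wer("қайырлы таң құрметті көрермен", "қайырлы таң көрермен"))
```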