Computer-mediated communication
CUSTOMER SATISFACTION WITH A NAMED ENTITY RECOGNITION (NER)
STORE-BASED MANAGEMENT APPLICATION BASED ON CUSTOMERS TO
CUSTOMERS-SUPPORT TEXT MESSAGES
Abstract
With the rising popularity of e-commerce, the service retail industries aim to reduce inventory
and increase sales and profit margins. To achieve this, it is of paramount importance to establish
good and effective interaction between customers and customer-support. When a customer orders
a product online, the store must establish whether the product is in stock and which stores are
nearest to the customer. Currently, the needs of customers are unlikely to be met effectively.
Hence, the stores are unlikely to provide the desired products to customers even when the
inventory is high. This paper investigates this issue at a typical and popular retail store in
Vietnam in two main stages. First, a corpus analysis of text messages exchanged between
customers and customer-support on the stores’ websites was carried out to explore the lexical
patterns that indicate the customers’ needs. This analysis revealed how frequently customers ask
for the locations of stores where they can buy the goods and/or whether the goods are in stock.
In the second stage of the investigation, the findings from the corpus analysis were used for data
extraction with Named Entity Recognition (NER) software. The NER software recognizes
entities including locations and names.
Key words: customer satisfaction, text messages, Named Entity Recognition (NER)
1. Introduction
Increasing sales and profit margins is a major objective of the service retail industries. Good
customer service, i.e., good interaction between customers and customer-support, is considered
to play the most important role in achieving this goal. Unlike in the past, when offline stores were
common, e-commerce is now popular and widespread. Customers are therefore used to shopping
online and contact stores mainly through websites and smartphone apps (Ek et al. 2011).
However, there have been difficulties when a customer orders a product online. The store must
search for the product to establish whether it is in stock and identify the stores nearest to the
customer. This process often takes a while with current computer-based systems. In many cases,
the needs of customers are not met.
There is evident concern in the service retail industries about how to reduce inventory and
increase sales and profit margins (Sirimanna and Gunawardana 2020). To achieve this goal, the
ideal scenario involves good interaction between customers and customer-support.
The context of this research investigation involved popular retail stores where the importance of
this interaction is clearly highlighted.
As e-commerce becomes increasingly popular and widespread, customers have become online
shoppers who contact stores mainly through websites and smartphone apps. However, when a
customer orders a product online, it takes a while for the store to establish whether the product is
in stock and which stores are nearest to the customer. With current computer-based systems, this
takes time, and in many cases the needs of customers are not met. As a result, the stores are
unlikely to provide the desired products to customers even when the inventory is high. A growing
number of customer complaints can be found in the customer-support chat box on the company’s
website.
The current research investigates this issue at a typical and popular retail store in Vietnam and
examines how Named Entity Recognition (NER) software can support quicker and more efficient
processes of buying and selling goods. This research supports the use of text messages (Thurlow
and Poff 2011) as a means of providing good customer service in service sectors.
2. Literature Review
Computer-mediated communication (CMC) is the communication medium individuals use to
establish relational and social meaning, making use of technological advances. Many forms of
online communication exist, such as emails, blogs, messaging apps, and online conferencing.
The term CMC refers to all such online communications or digital platforms used for human
interaction (Sheblom 2020, Altohami 2020).
How CMC affects different aspects of human life, whether in individual or organised group
communication, for pleasure, business, or scholarly activities, has been explored by researchers
(December 1996). CMC has been shown to facilitate effective learning experiences in
environments where a constructive alignment learning philosophy can be implemented, with
particular reference to designing learning experiences for learners of second languages
(Romiszowski and Mason 2004). CMC enables users to enhance their interactions and exchange
ideas and opinions with other people who share similar interests and learning goals.
CMC tools can be synchronous or asynchronous. Synchronous CMC modes allow users to have
real-time interaction and discussions, for example in real-time meeting rooms or video
conferencing. This is similar to spoken interaction (Sykes 2005).
Asynchronous CMC does not happen in real time. Instead, it is based on recorded or published
information, which users access in their own time. The time lag between receiving a message and
contributing an answer allows users to consider their answers thoughtfully. Users are also able
to reread or re-watch the recorded communication whenever they wish (Altohami 2020,
Newhagen 1996).
According to Berry (2004), the main advantages of asynchronous communication media over
synchronous communication media include flexibility over time and distance combined, more
active and equal team member participation, and the ability of users to reflect or collect data
before responding. The next section discusses text messages, one of the most popular CMC
modes, which reflect these advantages.
2.1 Text messages
It is evident that text messaging, also referred to as SMS or texting, has gained huge popularity
in the last three decades (Altohami 2020). Texting has become appealing because it is cheap,
personal and unobtrusive. The evolving conventions of text messages can, however, make them
hard to interpret, which could result in misunderstandings (Thurlow and Poff 2011). Despite
these issues, text messages are an effective means of social interaction such as greetings,
exchanging ideas, and providing help and support (Crystal 2008). Text messaging is also used
to organise real-time synchronous interactions with families, friends, and colleagues. Instant
messaging (IM) technology offers users the ability to chat over the internet in real time using
text (Larson 2011).
Sirimanna and Gunawardana (2020) observed the development of the technology, noting that
instant messaging (IM) apps now allow users to transmit not only text but also images and files.
Such files are usually sent through the chat app as hyperlinks, voice, or video. Instant
messaging is now prolific in the workplace as it allows effective and quick communication.
Cho et al. (2005) indicate that IM can support better work performance of employees through
the quality of communication and trust, though it could interrupt work in some cases. IM,
therefore, is effective in enhancing other CMC tools. A more effective and comprehensive work
communication environment can thus be created for employees.
It is evident that previous studies of CMC were concerned with the effectiveness of the medium
and the level of user satisfaction (Giri and Kumar 2010, Mukahi et al. 2003). There seems to be
a dearth of previous research examining the impact of CMC on customer satisfaction or how
Named Entity Recognition (NER) is applied in this context. The following section outlines
previous research efforts investigating NER.
2.2 Previous research into NER
There seems to be little published research on NER in constrained environments. Jiang et al.
(2010) investigated Hidden Markov Models (HMMs) using an SMS database of 1000 messages.
Named entities related to events and activities were extracted from Chinese SMS messages. The
findings were specifically aimed at use on handsets.
Polifroni et al. (2010) adopted logistic regression to recognise named entities, such as time,
date, and location, from spoken and typed messages. This research was carried out in a
controlled laboratory setting where the SMS messages and spoken words of the participants
were transcribed to establish a database. The research effort was focused on automatic speech
recognition for mobile phones.
Hård af Segerstad (2002) analysed the linguistic characteristics of Swedish text messages and
the usage of SMS in Sweden. The research findings showed interesting aspects related to the use
of abbreviations borrowed from other languages and not known in the Swedish language.
Ek et al. (2011) investigated the implementation of an NER application for text messages in
Swedish on a mobile platform. The research supports the idea that extracted entities, including
names, locations, dates, times and telephone numbers, could be used by other applications on
the phone.
In Vietnam, there is little research on NER. Pham (2020) discusses VLSP 2016, a Vietnamese
training set established by the Association for Vietnamese Language and Speech Processing
(vlsp.org.vn). It provides two data sets: named entity recognition and sentiment analysis. Four
named-entity categories were adopted in VLSP 2016, similar to those considered in the
CoNLL 2003 shared task: person (PER), organization (ORG), location (LOC), and
miscellaneous entities (MISC).
There are two machine-learning approaches to NER: conventional machine learning and deep
learning. Models that can be applied to an NER system on VLSP 2016 include Hidden Markov
Models, Support Vector Machines, Conditional Random Fields (CRFs), Maximum-Entropy
Markov Models (MEMMs), recurrent neural networks (RNNs) with LSTM units (Li et al. 2020,
Pham 2019), and PhoBERT (Nguyen and Nguyen 2020).
Among the above approaches, BERT (Bidirectional Encoder Representations from Transformers)
is considered one of the most common pre-trained language models (Devlin et al. 2019). Like
spaCy and OpenNLP, BERT is open source and has become popular. Huge improvements on a
variety of NLP tasks have been achieved with BERT, which is now well supported and widely
used. The currently available BERT tools have mainly been applied to English. However, the
BERT architecture can be retrained, or existing pre-trained multilingual BERT-based models
can be employed (Devlin et al. 2019; Lample and Conneau 2019).
In this study, the authors use BERT to recognise named entities. The NER software in this
research recognizes entities including locations and names, and supports quicker and more
efficient processes of buying and selling goods at a typical and popular retail store in Vietnam.
3. Methodology
The study aims to establish good and effective interaction between customers and customer-
support through analysing text-messages. The two research questions are:
(1) To what extent does Named Entity Recognition (NER) software support more efficient
processes to increase sales and profit margins in the retail industry?
(2) How satisfied are the customers with the support of the NER?
To answer the two research questions, the study includes two main stages and a pilot
evaluation. Stage 1 is a corpus analysis of a set of collected text messages to explore the
lexical patterns that indicate the customers’ needs. In the second stage of the investigation, the
findings from the corpus analysis were used for data extraction with Named Entity Recognition
(NER) software, covering entities including locations and names. Finally, a pilot evaluation of
customers’ satisfaction with the revised app was carried out based on data collected via
customers’ comments on the website.
3.1 Stage 1
In the first stage of the research, the authors collected message exchanges (Altohami 2020)
between customers and customer-support on the company’s website, including chat box support,
comments on the website and chat applications such as Zalo and Messenger. Messages were
short conversations, mainly customers asking whether the products they wanted to buy were in
stock, asking for the addresses of the stores where they could collect the products, or providing
their own addresses for deliveries. The messages also covered customers’ complaints about
products and supporters’ replies identifying the addresses of the buyers before transferring the
issues to the responsible departments.
These messages were written in Vietnamese, and some contained common ‘borrowed’ English
words such as ‘thanks’, ‘inbox’, ‘chat’. These English words were often seen among young
customers. About 3500 text messages were collected. The data collection took into consideration
any ethical and confidentiality issues. The researchers applied the company’s policy of keeping
customers’ information confidential. The data covered all aspects of customer to
customer-support interaction, as it was collected from all the means of communication offered
to customers by the company. We annotated the dataset with the following categories of named
entities: date, time, product, location and address.
The text messages exchanged between customers and customer-support on the stores’ websites
were collected from December 2021 to February 2022. This is the time when demand for
products increases due to the Tet holiday (the biggest holiday in Vietnam). This period was
chosen because the demand for effective customer support services increases and affects the
retailing process.
Figure 1. The two stages of the research
We divide our system into two components, as shown in Figure 1. The first component
consisted of a real-life, text-based communication core, which was collected from three sources:
the mobile chat application, chat box support on the website, and comments on the website. This
component of the system was used to generate the datasets for the second stage.
Secondly, in the collected raw dataset, some issues concerning spelling mistakes, redundancy,
or punctuation need to be addressed before processing the data. In detail, the issues that need
to be resolved are as follows.
The first issue concerns Unicode. Vietnamese text can be written in two Unicode forms
(precomposed and decomposed characters), which produce different UTF-8 byte sequences for
the same visible text. This causes difficulties when processing the text. For example:
b'Ti\xe1\xba\xbfng Vi\xe1\xbb\x87t'
b'Tie\xcc\x82\xcc\x81ng Vie\xcc\xa3\xcc\x82t'
The difference in how the tone mark is represented (e.g. ‘oè’ and ‘oè’) could lead to a
difference in meaning. The solution is to use Normalization Form C (a defined normalization of
Unicode strings which makes it possible to determine whether two Unicode strings are
equivalent to each other; Wikimedia) to standardize on one type of Unicode.
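The normalization step can be illustrated with Python's standard `unicodedata` module. This is
a minimal sketch that uses the two byte strings quoted above; the helper name `to_nfc` is an
assumption made for illustration and is not taken from the authors' system.

```python
import unicodedata

# The two UTF-8 byte sequences for "Tiếng Việt" quoted in the text.
precomposed = b'Ti\xe1\xba\xbfng Vi\xe1\xbb\x87t'.decode('utf-8')
decomposed = b'Tie\xcc\x82\xcc\x81ng Vie\xcc\xa3\xcc\x82t'.decode('utf-8')


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (hypothetical helper name)."""
    return unicodedata.normalize('NFC', text)


# The two strings render identically but consist of different code points.
same_before = precomposed == decomposed
# After NFC normalization they compare equal.
same_after = to_nfc(precomposed) == to_nfc(decomposed)
```

Applying the same normalization to every message and to every dictionary entry before any
comparison keeps lookups consistent regardless of how the customer's keyboard composed the
characters.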
The second issue relates to spacing errors, which are solved by deleting the erroneous spaces.
The third issue concerns names written in lower case, which can convey different meanings. For
example, ‘hồ Xuân Hương’ means ‘the lake named Xuân Hương’, whereas ‘Hồ’ is a surname.
The solution for this issue is to convert all text to lower case.
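The spacing and lower-casing steps can be sketched as follows. This is an illustrative sketch
only: it assumes that "space deletion" means removing redundant spaces (repeated spaces and
spaces before punctuation), and the function name `clean_message` is hypothetical.

```python
import re


def clean_message(text: str) -> str:
    """Collapse redundant whitespace and lower-case a message (illustrative)."""
    # Collapse runs of whitespace into one space and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove a space that was typed before a punctuation mark (assumed error type).
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    # Lower-case everything so that 'Hồ' and 'hồ' are treated alike.
    return text.lower()


cleaned = clean_message("  Hồ   Xuân Hương ,  Đà Lạt ")
```

Lower-casing discards the distinction between a surname and a common noun, so the entity type
then has to be decided from context by the NER model rather than from capitalisation.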
3.2 Stage 2
In the second stage of the research, the text message corpus collected in stage 1 is used as
input. Across the 3500 conversations, there are about 15000 common Vietnamese words,
including addresses, customers’ names, and names of products and materials. The recognition
procedure was carried out using Wordpiece.
Wordpiece is a tokenization method which is used in this study to split a sentence into parts,
including words. A common issue is that some of these words cannot be found in dictionaries;
these are called UNK (unknown) words. Compared with older approaches, e.g., the BPE used in
PhoBERT (Nguyen and Nguyen 2020), Wordpiece helps to solve the problem by splitting
high-frequency words into subcategories (clusters, vowels and consonants) (Figure 2). UNK
words are thereby represented as subwords whose meanings are easier to predict.
Figure 2. Wordpiece splitting sentences into parts
For instance, ‘cty Thế Giới Di Động’ is split into ‘cty’, ‘thế’, ‘giới’, ‘di’, ‘động’. If ‘cty’
(short for ‘công ty’, company) cannot be found in dictionaries because it is written in shorthand,
Wordpiece splits it into two subwords, ‘c’ and ‘ty’. The meaning can be revealed from the two
tokens.
Another example concerns missing spaces in customers’ texting. Given ‘hiện gà này ở bhx chợ
phú thọ trảngdàibiênhòa đn có không ạ’, Wordpiece splits the string into the tokens ‘hiện, gà,
này, ở, bhx, chợ, phú, thọ, trảng, ##dài, ##biên, ##hòa, đ, ##n, có, không, ạ’ for further
processing.
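The splitting behaviour in these examples can be sketched with the greedy longest-match-first
procedure commonly used for WordPiece tokenization. The vocabulary below is a toy assumption
chosen only to reproduce the examples in the text (a real vocabulary is learned from corpus
frequencies), and the function names are hypothetical.

```python
import unicodedata

# Toy vocabulary (assumption). Entries starting with "##" continue a word.
VOCAB = {
    "hiện", "gà", "này", "ở", "bhx", "chợ", "phú", "thọ", "trảng",
    "##dài", "##biên", "##hòa", "đ", "##n", "có", "không", "ạ",
    "c", "##ty",
}
VOCAB = {unicodedata.normalize("NFC", entry) for entry in VOCAB}


def split_word(word, vocab, unk="[UNK]"):
    """Split one word into the longest vocabulary pieces, left to right."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # mark a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def wordpiece_tokenize(sentence, vocab=VOCAB):
    """Lower-case, NFC-normalize, split on spaces, then split each word."""
    sentence = unicodedata.normalize("NFC", sentence.lower())
    tokens = []
    for word in sentence.split():
        tokens.extend(split_word(word, vocab))
    return tokens


tokens = wordpiece_tokenize(
    "hiện gà này ở bhx chợ phú thọ trảngdàibiênhòa đn có không ạ"
)
shorthand = split_word("cty", VOCAB)
```

With this toy vocabulary, the unspaced string is recovered as one word-initial piece followed
by three `##` continuation pieces, and `cty` becomes `c` plus `##ty`.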
After the words are split, the frequency of the 15,000 words is computed. This is used to build
the list of words that can be merged with each other, as shown in Figure 2 above.
The label engine (Lafferty et al. 2008) is the next step in this stage. In this step, the list of
words produced by the tokenizer engine is given labels, as shown in Table 1 below:
ID | Label | Description | Example
1 | B-LOC | Begin Location | Chung [B-LOC] (the initial word of ‘chung cư’, apartment building)
2 | I-LOC | Inside Location | Cư [I-LOC] (the ending word of ‘chung cư’, apartment building)
3 | B-ADD | Begin Address | Alexander [B-ADD]
4 | I-ADD | Inside Address | de [I-ADD], Rhodes [I-ADD]
5 | B-PROD | Begin Product | Nước [B-PROD] (the initial word of ‘nước mắm’, fish sauce)
6 | I-PROD | Inside Product | Mắm [I-PROD] (the ending word of ‘nước mắm’, fish sauce)
0 | O | Words which do not belong to the above groups | Chai [O] (bottle)
Table 1. Label classification
There are 3 groups in the label classification: Location, Address, and Product. Each group
includes 2 subgroups: B- (begin) and I- (inside) (Pham 2018). The last group, 0 (O), includes
words which do not belong to the first six labels. This follows a common format for tagging
tokens in computational linguistics presented by Ramshaw and Marcus (1995). The B- prefix
before a tag indicates that the token is at the beginning of a chunk. The I- prefix indicates that
the token is inside a chunk. An O tag indicates that a token belongs to no chunk.
The third step maps tokens to labels. For example, “Một chai nước mắm đến chung cư đường
Alexander de Rhodes” is encoded as shown below (Table 2).
Token | Token ID | Label | Label ID
Một | 11 | O | 0
chai | 13 | O | 0
nước | 15 | B-PROD | 5
mắm | 16 | I-PROD | 6
đến | 8 | O | 0
Chung | 56 | B-LOC | 1
Cư | 34 | I-LOC | 2
đường | 17 | O | 0
Alexander | 88 | B-ADD | 3
De | 98 | I-ADD | 4
Rhodes | 332 | I-ADD | 4
Table 2. Mapping tokens to labels
The combination of a token ID and its label ID is called an offset. A sequence of offsets forms
an encoded sentence. For example, the offsets from Table 2, [(11,0),(13,0),(15,5),(16,6),(8,0),
(56,1),(34,2),(17,0),(88,3),(98,4),(332,4)], encode the sentence “Một chai nước mắm đến chung
cư đường Alexander de Rhodes”. A dataset contains such encoded sentences. In this study,
12250 sentences from 3500 conversations were processed in this way.
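The encoding of a sentence into offsets can be sketched as below. The label IDs follow Table 1
and the token IDs are copied from Table 2; in a real system the token IDs would come from the
tokenizer vocabulary. The function name `encode` and the lower-cased tokens are assumptions
for illustration.

```python
# Label IDs as listed in Table 1.
LABEL_TO_ID = {
    "O": 0, "B-LOC": 1, "I-LOC": 2, "B-ADD": 3,
    "I-ADD": 4, "B-PROD": 5, "I-PROD": 6,
}

# Example sentence from the text, lower-cased and split into tokens.
tokens = "một chai nước mắm đến chung cư đường alexander de rhodes".split()

# Token IDs copied from Table 2 (illustrative lookup table, not a real vocabulary).
TOKEN_TO_ID = dict(zip(tokens, [11, 13, 15, 16, 8, 56, 34, 17, 88, 98, 332]))

labels = ["O", "O", "B-PROD", "I-PROD", "O", "B-LOC", "I-LOC",
          "O", "B-ADD", "I-ADD", "I-ADD"]


def encode(tokens, labels):
    """Pair each token ID with its label ID to form the encoded sentence."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [(TOKEN_TO_ID[t], LABEL_TO_ID[l]) for t, l in zip(tokens, labels)]


offsets = encode(tokens, labels)
```

The resulting list of pairs matches the offsets quoted for this sentence in the text.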
In the final step, the dataset is input to the NER engine. Tokens are recognized and classified
according to the labels B-LOC, I-LOC, B-ADD, I-ADD, B-PROD, I-PROD and O. For example,
for the encoded input “Hồ Xuân Hương”, “Hồ” is classified as B-ADD, “Xuân” as I-ADD, and
“Hương” as I-ADD. The tokens are then merged to form the address entity “Hồ Xuân Hương”.
Similar processes are applied to identify location and product entities.
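The merging of B- and I- tagged tokens into entities can be sketched as follows. This is a
minimal illustration of the merging idea, not the authors' implementation; the function name
`merge_entities` is hypothetical, and dropping an I- tag that has no preceding B- tag of the
same type is a design choice made here for simplicity.

```python
def merge_entities(tokens, labels):
    """Merge B-/I- tagged tokens into (entity_type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Store the entity collected so far (if any) and reset the buffer.
        nonlocal current_type, current_tokens
        if current_tokens:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            flush()  # a B- tag always starts a new entity
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and label[2:] == current_type:
            current_tokens.append(token)  # continue the open entity
        else:
            flush()  # an O tag or a stray I- tag closes the open entity
    flush()
    return entities


tokens = "một chai nước mắm đến chung cư đường alexander de rhodes".split()
labels = ["O", "O", "B-PROD", "I-PROD", "O", "B-LOC", "I-LOC",
          "O", "B-ADD", "I-ADD", "I-ADD"]
entities = merge_entities(tokens, labels)
```

For the example sentence this yields one product entity, one location entity and one address
entity, in the order in which they appear.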
3.3 A pilot evaluation on customer satisfaction
A pilot experiment was carried out for a preliminary evaluation of customer satisfaction. The
proposed chat facility within the NER application was used in a pilot investigation lasting two
weeks. Around 600 conversations in the chat box were collected during this time. The messages
mainly related to customers asking for the products they wanted to buy and the stores nearest
to them. These messages were reviewed to establish the overall satisfaction of the customers
who chatted with customer-support.
To gather customers’ feedback on the revised application, an online feedback form was
designed and attached to the website so that customers could fill it in after their conversation
with customer-support. The form contains 5 main questions which attempt to quantify the
customers’ self-assessment of their use of the revised chatting app. These questions were
designed to establish how satisfied the customers were with the customer support service. Of
the 600 customers, 387 (64.5%) completed the online feedback form. Participants answered the
questions on a Likert scale from 1 (very bad) to 5 (very good). One open-ended question
(question 6) was used for triangulation of the answers and for further comments.
These questions in the feedback form are as follows:
1. How much do you like the chatting app?
2. How fast do you have the answers for the products you want to buy?
3. How fast do you receive the answers for the nearest stores?
4. How much are the suggestions on the website appropriate to your needs?
5. What scale do you choose to rate the level of service satisfaction? (from 1 very bad – 5
very good)
6. Which Application version do you prefer? The old version or the new version?
4. Results and discussion
In this section, the data collected from the corpus analysis and the NER software (Larson 2011,
Pham 2018) are explored to find out how NER supports good interaction between customers
and customer-support. In the first part, locations refer to customers’ addresses, as provided by
them, which are used to establish the nearest store that can supply the ordered products. The
locations database, therefore, also refers to the stores’ addresses, thus linking the customers to
the nearest store locations. The names database refers to the products and the names of
customers. In the second part of the section, the customers’ satisfaction with the support of the
NER is discussed.
4.1 NER software supports more efficient processes to increase sales and profit margins in the
retail industry
4.1.1 The common entities reflecting customer’s needs
The entities ‘products’ and ‘locations’ appear with high frequency in the text messages, 96%
and 89% respectively. Table 3 indicates that customers who contact the store to buy products
often enquire about where they can find the products they want. Customer support, in return,
asks customers for their addresses in order to meet their needs.
Entity | Details | Frequency | Label | Count (of 3500 in total)
Sample date | Friday | 79% | DATE | 2765
Sample time | 5.PM - 10.PM | | TIME |
Sample telephone no | +092.xxxxx | | PHONE |
Product | Bò Úc, rau, sản phẩm 4kFarm, sữa Vinamilk (Australian beef, vegetable, 4kFarm, Vinamilk) | 96% | PRODUCT | 3360
Location | Gần trường học, Vinhome, Nhà văn hoá thanh niên (near schools, Vinhome, behind Youth Cultural House) | 89% | LOCATION | 3115
Address | Phường Tỉnh quận huyện thành phố (Ward, District, City) | 83% | ADDRESS | 2905
First name & Titles | Cô Lê, Chú Ba, Anh Bảy, chị Hai, chủ nhà, chủ quán kế bên nhà (house owner, the shop owner who lives next door) | 72% | NAME | 2520
Table 3. The frequency of common entities
The sample data indicated that 89% of customers provided their location while 83% also
provided their address. Customers provide their addresses to receive support in finding the
stores where they can buy the products they need. These entities therefore appear in almost all
customer to customer-support conversations. Interestingly, these entities also appear together
with high frequency.
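The relation between the counts and the percentages in Table 3 can be checked with a short
calculation. The counts and the total of 3500 are taken from Table 3; the function name
`frequency_percent` is a hypothetical helper introduced for illustration.

```python
TOTAL_MESSAGES = 3500  # total number of collected messages (Table 3)

# Number of messages containing each entity type, as reported in Table 3.
ENTITY_COUNTS = {
    "DATE": 2765,
    "PRODUCT": 3360,
    "LOCATION": 3115,
    "ADDRESS": 2905,
    "NAME": 2520,
}


def frequency_percent(count, total=TOTAL_MESSAGES):
    """Share of messages containing an entity, as a whole-number percentage."""
    return round(100 * count / total)


frequencies = {label: frequency_percent(n) for label, n in ENTITY_COUNTS.items()}
```

Each computed value agrees with the percentage reported for the same label in Table 3.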
4.1.2 Date and time related to customers’ purchases and needs
Table 4 shows that customers often enquired about ordering food around 10 a.m.-1 p.m. and
6 p.m.-11 p.m. on weekdays and from 9 a.m. to 11 p.m. at weekends and on holidays. The
products required, mainly fish, meat, eggs, milk, vegetables and fruits, were quite varied on
weekdays.
Date | Time | Product
Monday | 10AM - 1PM, 6PM - 10PM | Thịt, cá, trứng, sữa, rau tươi, trái cây (Meat, fish, eggs, milk, vegetables, fruits)
Tuesday | 10AM - 1PM, 6PM - 10PM | Thịt, cá, trứng, sữa, rau tươi, trái cây (Meat, fish, eggs, milk, vegetables, fruits)
Wednesday | 11AM - 1PM, 6PM - 10PM | Thịt, cá, trứng, sữa, rau tươi, trái cây (Meat, fish, eggs, milk, vegetables, fruits)
Thursday | 11AM - 1PM, 6PM - 11PM | Thịt, cá, trứng, sữa, rau tươi, trái cây (Meat, fish, eggs, milk, vegetables, fruits)
Friday | 6 PM - 12 PM | Hải sản, thịt, rau củ (Seafood, meat, vegetables)
Weekend | 9AM - 11PM | Bánh kẹo, nước ngọt, trái cây, đồ dùng một lần (Sweets, soft drink, fruits, disposable products)
Holiday | 9AM - 11PM | Bánh kẹo, nước ngọt, trái cây, đồ dùng một lần (Sweets, soft drink, fruits, disposable products); Thịt & hải sản (Meat & seafood)
Table 4. Types of products according to date and time
Evidently, customers’ enquiries vary according to date and time. For example, at weekends,
entertainment-related food and drink such as beer, soft drinks, cakes and seafood, as well as
disposable products, are ordered more frequently, particularly from the suburbs of Ho Chi
Minh City (HCMC).
4.1.3 Products – Location relationship related to customers’ needs
The analysis of the collection of written texts (Altohami 2020) with Wordpiece showed that
common products such as meat, fish, eggs, vegetables, fruits, milk, cheese, yogurt and cereal
are required to a similar extent by customers from residential areas, office buildings and
industrial parks (Table 5). This demonstrates the high demand for these products.
Residential areas
● High population areas, e.g. Bình Thạnh, Phú Nhuận, Gò Vấp, Tân Bình, Tân Phú districts
● Apartment buildings, e.g. Vinhome Center Park, Vinhome Grand Park
● Dormitories, e.g. University Dormitory in Thu Duc
Item | %
Đồ vệ sinh cá nhân (Toiletries): Nước rửa tay (Hand sanitizer), Khẩu trang (Face mask) | 43%
Đồ khô (instant noodle): Gạo và ngũ cốc (Rice & cereal), Đồ chay (Vegan food), Gia vị (Flavour), Bánh kẹo (Sweet food) | 85%
Đồ uống (Beverages): Rượu, bia, nước ngọt (Wine, beer, soft drink), Nước khoáng (Mineral water), Nước yến (Bird's nest drink), Trà và cà phê (Coffee & tea), Nước uống sô cô la (Chocolate drink) | 71%
Thịt, cá, trứng, rau, trái cây (Meat, fish, eggs, vegetables, fruits) | 92%
Sữa, phomai, yaourt, ngũ cốc (Milk, cheese, yogurt, cereal) | 82%
Đồ đông lạnh (Frozen food) | 38%

Office buildings & Industrial Parks
● District 1, 3
● High Tech Park
● Quang Trung Software City, district 12
● Tan Tao Industrial Park, Tân Binh District
● Vinh Loc Industrial Park, Binh Chanh district
Item | %
Đồ dùng 1 lần (Disposable items): Nước rửa tay và khăn giấy (Hand sanitizer and tissue) | 83%
Thức ăn nấu sẵn (Ready-to-eat food): Snack & Sandwich, Bánh kẹo (Sweets), Mì ly, xúc xích (Instant noodle and sausage) | 89%
Đồ uống (Beverages): Rượu, bia, nước ngọt (Wine, beer, soft drink), Nước khoáng (Mineral water), Trà và cà phê (Coffee & tea) | 81%
Trái cây (Fruits) | 65%
Sữa, phomai, yaourt, ngũ cốc (Milk, cheese, yogurt, cereal) | 85%
Đồ đóng hộp (Canned food) | 23%
Table 5. Products – location and location-products
However, customers who live in areas with more apartment buildings (e.g. Vinhome Centre
Park in District 2, Vinhome Grand Park in District 9, and Tan Binh and Tan Phu, where the
population is high) require toiletries more. Ready-to-eat food and disposable products are
required more in areas with more office buildings and industrial parks. It is worth noting that
demand for all products is higher from suburbs such as Thu Duc than from the centre of the
city (Districts 1 and 3). This reflects the recent trend of expansion of residential areas from the
centre to the suburbs of HCMC.
Based on such information related to the required products and locations, which was established
with the use of NER (Li et al. 2020), effective suggestions to match customers to products and/or
location could be provided. This is crucial to provide good customer service responding to
customer needs.
4.1.4 The use of lexical features for good interaction between customers and customer-support
In this section, a number of lexical features of the communication data collected for this
research, such as titles prefixing people’s names, are explored (Table 6).
Title prefixes
In Vietnamese, it is important to show politeness by using titles. ‘Anh’ for male customers and
‘chị’ or ‘em’ for female customers are most commonly used, as they suit the majority of
customers. When customers are known to be older people, ‘cô’, ‘chú’ or ‘bác’ is used.
Shorthand techniques
Common shorthand forms used by customers are: Tks = thanks, ac = anh chị, e = em, k/ko =
không (can’t), dc = được (can), hsd = hạn sử dụng (expiry), while NSX (Ngày sản xuất,
manufactured date), HSD (Hạn sử dụng, expiry date) and SL (Số lượng, số lô; quantity, lot
number) are often used by customer support.
Common emojis and emoticons
Customers usually expressed their positive or negative feelings with emojis and emoticons,
mainly for happy (:-)) or sad (:-( ☹).
Greetings and questions
Some questions that are often seen in the collected text messages are “Ac có đang inbox ko?”
and “Allo ac” (“Are you in the inbox?”, “Allo?”). In many cases, these questions are considered
greetings.
Lexical features | Customers | Customer support
Greetings | Chào ac | Chào anh/chị
Title prefixes | Male: Anh; Female: Chị; Both: Em, tôi | Male: Anh, chú; Female: Chị, cô; Both: Em, quý khách, bác
Shorthand techniques | Tks = thanks; ac = anh chị; e = em; k/ko = không; dc = được; hsd = ha … | Cảm ơn (Thanks); NSX (Ngày sản xuất - Manufactured date); HSD (Hạn sử dụng - Expiry); SL (Số lượng, số lô - Quantity, Lot number); cty (Công Ty - Company), tnhh (Ltd)
Hedges | Vâng (Yeah), ạ, dạ, ơi | Vâng (Yeah), ạ, dạ
Emoticons | Happy: happy :-) or ‘hehe’; Unhappy: sad :(, confused =/, cool B-) | Happy: happy :-) or ‘hehe’; Sorry feelings when the needs of customers are not met yet: :(( :-((
Table 6. Lexical features
The result is a higher frequency of replies from customer support such as ‘Dạ có ở cửa hàng
Nguyen Dinh Chieu, gần chỗ chị’ (‘Yes, we sell them in Nguyen Dinh Chieu, near your place’)
than of replies such as ‘Xin lỗi, hiện chúng tôi hết hàng’ (‘Sorry, we are out of it’) or ‘Anh/chị
chờ một chút ạ’ (‘Please can you wait for a while’). The more often the replies were ‘Có’ (‘Yes’)
rather than ‘Không có’ (‘No’), the more happy emoticons were shown in customers’ messages.
In summary, customers made two common kinds of request: (1) asking whether the products
they need are in stock (products-location) and (2) providing their addresses to request the
locations of the nearest stores where they can buy the products (location-products).
As for products-location, the results show that, as expected, customers require products close
to their locations. Generally, all customers’ enquiries related not just to the products but also,
as expected, to the store nearest to them. The findings reveal that, with the support of the NER
engine (Nguyen and Nguyen 2020), when the needs of customers increased, the retail system of
the company was able to meet such needs effectively, as evidenced by the higher frequency of
‘Có’ (‘Yes’) replies than ‘Không có’ (‘No’) replies and the high frequency of happy emoticons
(:-)) in the customers’ messages.
As for location-products, if the customer’s desired store is out of stock of the product, the
proposed NER system could suggest appropriate alternative stores where the product is in
stock. This resulted in a significant reduction in customer complaints about the clarity of
product availability in different locations. NER has also supported the process of finding
customers’ addresses and passing them to delivery staff to make sure products can be delivered
to the customers quickly.
In addition, the research data included evidence of the current expansion of residential areas
from the center to the suburbs of HCMC. Hence, the proposed NER system is expected to help
arrange products and delivery appropriately to meet customers’ needs in the wider suburbs of
HCMC.
4.2 The customers’ satisfaction with the support of the NER
In this section, the results of a pilot evaluation of customers’ satisfaction are discussed. As
shown in Table 7, the majority of customers found the proposed app helpful in buying the
products they want.
The percentage of customers who chose “5” or “4” (A great deal/Good) for the question on “how
fast they have the answers” was 60%. Similarly, the percentage of customers who chose “5” or “4”
for the question on “how much the suggestions on the web are appropriate to your needs” was 80%.
For the question “How much do you like the chatting app?”, the majority of the customers, 86% in
total, also chose either “5” or “4”.
As for the last question, on whether customers preferred the old or the new version of the app,
90% of the customers said they preferred the revised app. The reasons they gave included the
following:
“I prefer the new version because I received quick answers. It makes me feel I am cared for.”
“I prefer the new version because it is very friendly and natural. I don’t feel like I am talking
with a machine.”
“I often receive the right products to buy and can find the right shops easier.”
                                                                        A little/Bad              Average      A great deal/Good
Question                                                                1            2            3            4            5
How much do you like the chatting app?                                  4 (1%)       50 (12.9%)   100 (25.8%)  153 (39.5%)  180 (46.5%)
How fast do you have the answers for the products you want to buy?      35 (9%)      45 (11.6%)   50 (12.9%)   100 (25.8%)  157 (40.6%)
How fast do you receive the answers for the nearest shops?              45 (11.6%)   50 (12.9%)   50 (12.9%)   70 (18.1%)   172 (44.4%)
How much are the suggestions on the website appropriate to your need?   10 (2.6%)    32 (8.2%)    88 (22.7%)   120 (31%)    130 (33.5%)
What scale do you choose to rate the level of service satisfaction?     4 (1%)       56 (14.5%)   63 (16.3%)   98 (25.3%)   166 (42.9%)
Table 7. The customers’ feedback on using the revised app
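The share of customers choosing 4 or 5 on a given question can be derived from the raw counts
in a row of Table 7. As a hedged sketch, the snippet below does this for the service-satisfaction
row; the function name `top_two_share` is hypothetical, and taking the row total to be the sum of
the five counts is an assumption about how the percentages were derived.

```python
def top_two_share(counts):
    """Percentage of responses that chose 4 or 5 on a 1-5 scale.

    `counts` holds the number of responses for scale points 1..5, in order.
    """
    if len(counts) != 5:
        raise ValueError("expected counts for scale points 1 to 5")
    total = sum(counts)
    if total == 0:
        raise ValueError("no responses")
    return 100.0 * (counts[3] + counts[4]) / total


# Counts copied from the service-satisfaction row of Table 7.
service_satisfaction = [4, 56, 63, 98, 166]
print(f"{top_two_share(service_satisfaction):.1f}%")  # 68.2%
```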
5. Conclusion and Recommendation
The research showed that customers responded strongly and positively to the proposed app. This
suggests that an NER system can make interaction between customers and customer-support more
effective. Therefore, the design and implementation of such apps is recommended. This leads to
the following conclusion.
The Named Entity Recognition (NER) system proposed by the authors in this paper was found to
increase the efficiency of the retail process at a time when e-commerce is increasingly popular
and widespread. The effectiveness was evident in both of the investigated aspects of the
process, namely location-products and products-location. The NER system provided a platform for
a quick and effective customer interface to find the products of interest as well as the nearest
store where the products are in stock. The user interface first offers the location-products
option, with output statements indicating whether the product is in stock at the customer’s
location. When the product is not available at the customer’s location, the NER data engine
switches to the second option, products-location. This option queries the database to list all
stores where the product is in stock.
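The two options described above can be sketched as a simple lookup. This is a minimal
illustration rather than the authors’ implementation: it assumes the NER step has already
extracted a product name and a store location from the message, and the `inventory` dictionary,
the function name `find_stores`, the product name, and the store names other than Nguyen Dinh
Chieu are hypothetical.

```python
from typing import Dict, List

# Hypothetical inventory layout: store name -> {product name -> units in stock}.
Inventory = Dict[str, Dict[str, int]]


def find_stores(product: str, preferred_store: str, inventory: Inventory) -> List[str]:
    """Return the preferred store if it stocks the product, else all stores that do."""
    # Option 1 (location-products): check the store at the customer's location.
    if inventory.get(preferred_store, {}).get(product, 0) > 0:
        return [preferred_store]
    # Option 2 (products-location): fall back to every store with the product in stock.
    return sorted(
        store
        for store, stock in inventory.items()
        if stock.get(product, 0) > 0
    )


# Illustrative data only.
inventory = {
    "Nguyen Dinh Chieu": {"rice cooker": 0},
    "Le Van Sy": {"rice cooker": 3},
    "Vo Van Tan": {"rice cooker": 1},
}
print(find_stores("rice cooker", "Nguyen Dinh Chieu", inventory))
# ['Le Van Sy', 'Vo Van Tan']
```

An empty list means that no store has the product, which corresponds to an out-of-stock reply.
A production system would presumably also rank the alternatives by distance from the customer’s
address, which this sketch does not attempt.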
The authors are now seeking collaboration with e-commerce retailers in various sectors and
geographical locations to expand this research, with a view to refining a targeted database
that links customers to their preferences and needs. This further research will employ tools
from Artificial Intelligence (AI).
Limitation
The paper was limited in several ways. Firstly, the data were collected mainly through the
company website. Although the website is a major channel for communicating with the company’s
customers, the authors would also like to explore the research issues in other means of
communication, e.g. email. However, this kind of information could not be accessed because it
consists of internal company documents.
Secondly, the data are limited to one company. To improve the validity and generalizability of
the findings, data from other companies should be incorporated as the research develops in the
future.
This research is partly funded by University of Economics Ho Chi Minh City (UEH), Vietnam.
References
Altohami, M.A Waheed (2020). Text messages: A computer-mediated discourse analysis.
International Journal of Advanced Computer Science and Applications, Vol. 11, No. 7,
pp. 79-87.
Berry, G. R. (2004). Lessons from the Online Teaching Experience. Journal of the Academy of
Business Education, 5, 88-97.
Cho, H.-K., Trier, M. and Kim, E. (2005). The use of instant messaging in working relationship
development: A case study. Journal of Computer-Mediated Communication, 10, 4:
http://jcmc.indiana.edu/vol10/issue4/cho.html
Customer satisfaction. (2023, Jan 05). Wikipedia. https://en.wikipedia.org/wiki/Customer_satisfaction
Crystal, D. (2008). Txtng: The gr8 db8. New York: Oxford University Press.
December, J. (1996). Units of analysis for Internet communication. Journal of Computer-
mediated Communication, Vol. 1, No. 4.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional
transformers for language understanding. In Proceedings of NAACL, pages 4171-4186.
Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv
preprint, arXiv:1412.6980.
Ek, T., Kirkegaard, C., Jonsson, H., & Nugues, P. (2011). Named Entity Recognition for Short Text
Messages. Procedia - Social and Behavioral Sciences, 27, pp. 178-187.
Giri, V. N., & Kumar, P. (2010). Assessing the Impact of Organizational Communication on Job
Satisfaction and Job Performance. National Academy of Psychology (NAOP) India.
Hård af Segerstad, Y. (2002). Use and adaptation of written language to the conditions of
computer-mediated communication. Doctoral thesis, Göteborg University.
Jiang H., Wang X., & Tian J. (2010). Second-order HMM for event extraction from short
message. In Proceedings of NLDB, pages 149–156.
Lample, G., & Conneau, A. (2019). Crosslingual Language Model Pretraining. In Proceedings of
NeurIPS, pages 7059–7069.
Larson, G. W. (2011). Instant Messaging. Encyclopedia Britannica.
Lafferty, J., McCallum, A., & Pereira, F. (2008). Conditional Random Fields: Probabilistic
Models for Segmenting and Labeling Sequence Data. Semantic Scholar. 282-289.
Li, P. H., Fu, T. J., & Ma, W. Y. (2020). Why Attention? Analyze BiLSTM Deficiency and Its
Remedies in the Case of NER. In AAAI '20. 8236--8244.
Mukahi, T., Nakamura, M., & Not, R. D. (2003). An Empirical Study on Impacts of Computer-
Mediated Communication Management on Job Satisfaction. Adelaide, South Australia: 7th
Pacific Asia Conference on Information Systems.
Named-entity recognition. (2023, Jan 05). Wikipedia. https://en.wikipedia.org/wiki/Named-
entity_recognition
Newhagen, J. E., & Rafaeli, S. (1996). Why communication researchers should study the internet: A
dialogue. Journal of Communication, Vol. 46, No. 1, pp. 4-13.
Nguyen, D. Q., & Nguyen, A. T. (2020). PhoBERT: Pre-trained language models for
Vietnamese. In Findings of the Association for Computational Linguistics: EMNLP 2020.
Pham, P. Q. M. (2018). A feature-rich vietnamese named-entity recognition model. arXiv
preprint arXiv:1803.04375.
Polifroni J., Kiss I., & Adler M. (2010). Bootstrapping named entity extraction for the creation of
mobile services. In Proceedings of LREC.
Ramshaw, L. A. and Marcus, M. P. (1995). Text Chunking using Transformation-Based
Learning. Computation and Language. Vol. 11, pp. 82–94.
Romiszowski, A. and Mason, R. (2004). Computer-mediated communication. In Handbook of
research on educational communications and technology, D. H. Jonassen, Ed. Mahwah,
NJ: Lawrence Erlbaum Associates, pp. 397-431.
Sirimanna, U.I., and Gunawardana, T.S.L.W. (2020). Impact of Computer Mediated
Communication Systems on Job Satisfaction: Employees in the Transmission Division of
Ceylon Electricity Board, Sri Lanka. The 9th International Conference on Management and
Economics.
Sykes, J. M. (2005). Synchronous CMC and pragmatic development: Effects of oral and written
chat. CALICO Journal, Vol. 22, No. 3, pp. 399-431.
Thurlow, C. and Poff, M. (2011). Text messaging. In Handbook of the pragmatics of CMC, S. C.
Herring, D. Stein, and T. Virtanen, Eds. Berlin and New York: Mouton de Gruyter.
Additional Readings
Chiu, J., and Nichols, E. (2016). Named entity recognition with bidirectional LSTM-CNNs.
Transactions of the Association for Computational Linguistics.
Cui Y., Che, W, Liu, T., Qin, B., Yang, Z, Wang S., and Hu, G. (2019). Pre-Training with Whole
Word Masking for Chinese BERT. arXiv preprint, arXiv:1906.08101.
De Vries, W., Van Cranenburgh A., Bisazza A., Caselli T., Van Noord, G., and Nissim, M.
(2019). BERTje: A Dutch BERT Model. arXiv preprint, arXiv:1912.09582.
Huang, Z.; Xu, W.; and Yu, K. (2015). Bidirectional LSTM-CRF models for sequence tagging.
arXiv preprint arXiv:1508.01991.
Lin, B. Y.; Xu, F.; Luo, Z.; and Zhu, K. (2017). Multi-channel BiLSTM-CRF model for
emerging named entity recognition in social media. In Proceedings of the 3rd Workshop on
Noisy User-generated Text.
Ma, X., and Hovy, E. (2016). End-to-end sequence labeling via bi-directional LSTM-CNNs-
CRF. In Proceedings of the 54th Annual Meeting of the Association for Computational
Linguistics (Volume 1).
Nguyen H., Ngo H., Vu L., Chan V., and Nguyen H. (2019). VLSP Shared Task: Named Entity
Recognition. Journal of Computer Science and Cybernetics, 34(4):283–294.
Nguyen, K. A., Dong, N., and Nguyen Cam-Tu. (2019). Attentive Neural Network for Named
Entity Recognition in Vietnamese. In Proceedings of RIVF.
Ratinov, L., and Roth, D. (2009). Design challenges and misconceptions in named entity
recognition. In Proceedings of the Thirteenth Conference on Computational Natural
Language Learning (CoNLL-2009).
Wu, S. and Dredze, M. (2019). Beto, bentz, becas: The surprising cross-lingual effectiveness of
BERT. In Proceedings of EMNLP-IJCNLP, pages 833–844.
Key Terms and Explanations
Named-entity recognition: a subtask of information extraction that seeks to locate and classify
named entities mentioned in unstructured text into pre-defined
categories such as person names, organizations, locations, time
expressions, quantities, monetary values, percentages.
Customer satisfaction: a term frequently used in marketing. It is a measure of how products
and services supplied by a company meet or surpass
customer expectation.
Text message: real-time text transmission over the Internet.
Dr Le Thi-Hong Vo
University of Economics Ho Chi Minh City (UEH)
Vietnam
Dr. Le Thi-Hong Vo holds a PhD in TEFL/TESOL from the University of Portsmouth, U.K. With
over 12 years of experience, she is particularly interested in materials design, using technology
in English teaching and how English language training can be improved with IT support, classroom
research, teacher education, global Englishes, and the English language communicative competence
and intercultural communication required of graduates at the workplace.
Thien Hang Tuan
Mobile World Investment Corporation
Thien Hang Tuan is currently a developer at Mobile World Investment Corporation and a
lecturer at Mindx Technology School. He received a Bachelor of Engineering in software
engineering from the University of Information Technology, Vietnam National University. He
worked as a researcher at Soongsil University, Korea in 2019. His work
focuses on automated communication systems and fault prediction systems.
Dr Ayman Yossef Nassif
University of Portsmouth, U.K.
Dr Ayman Nassif is a highly motivated educator, researcher and consultant engineer.
His teaching and research experience spans solid mechanics, materials, structural
engineering at all undergraduate and postgraduate levels. His teaching expertise
includes fire structural engineering, thermo-mechanical FE modelling, concrete
materials and technology. His industrial experience includes structural design and
supervision of construction of buildings, roads, bridges, telecommunication towers and
irrigation systems. His experience was gained in Egypt, Finland, UK, Switzerland and
Vietnam.