Computer-mediated communication

CUSTOMER SATISFACTION WITH A NAMED ENTITY RECOGNITION (NER)

STORE-BASED MANAGEMENT APPLICATION BASED ON CUSTOMERS TO

CUSTOMERS-SUPPORT TEXT MESSAGES

Abstract

With the rising popularity of e-commerce, it is evident that the service retail industries aim to reduce

the inventory and increase sales and profit margins. To achieve this, it is of paramount

importance to establish good and effective interaction between customers and customer-support.

When a customer orders a product online, it is essential that the store establishes whether the

products are in stock and the nearest stores to where the customers are. Currently, the needs of

the customers are unlikely to be effectively met. Hence, the stores are unlikely to provide

desirable products to customers even when the inventory is high. This paper investigates this

issue at a typical and popular retail store in Vietnam. The authors present an investigation of this

issue through two main stages. Corpus analysis for a set of collected text messages, posted on the

stores’ websites between customers and customer-support, was firstly carried out to explore the

lexical patterns that indicate the customers’ needs. This analysis revealed the frequency of

customers’ requests of the stores’ locations where they can buy the goods and/or whether the

goods are in stock. In the second stage of the investigation, the valuable findings from the corpus

analysis were used for data extraction based on a Named Entity Recognition (NER) software.

The NER recognizes entities including locations and names.

Key words: customer satisfaction, text messages, Named Entity Recognition (NER)

1. Introduction

The increase of sales and profit margins is a major objective of the service retail industries. Good

customer service, i.e., good interaction between customers and customer-support, is considered

as playing the most important role in achieving this goal. Unlike in the past, when offline stores were

common, e-commerce is now popular and widespread. Customers, therefore, are used

to shopping online and contact stores mainly through websites and smartphone apps (Ek et al.

2011). However, there have been difficulties when a customer orders a product online. The

problems concern the store searching for the products to establish whether the products are in

stock and establish the nearest stores to where the customers are. The process often takes a while

when adopting current computer-based systems. In many cases, the needs of customers are not

met.

There is an evident concern in the service retail industries on how to reduce the inventory to

increase sales and profit margins (Sirimanna and Gunawardana 2020). In order to achieve this

goal, the ideal scenario relates to a good interaction between customers and customer-support.

The context of this research investigation involved popular retail stores where the importance of

this interaction is clearly highlighted.

As e-commerce becomes increasingly popular and wide-spread, customers have been considered

as online shoppers who contact stores mainly through websites and smart phone Apps. However,

when a customer orders a product online, it takes a while for the store to search for the products

to establish whether the products are in stock and establish the nearest stores to where the

customers are. Such activity, when adopting current computer-based systems, would take time

and in many cases, the needs of customers are not met. As a result, the stores are unlikely to

provide desirable products to customers even when the inventory is high. The increasing

customers’ complaints can be found in the customer support chat box in the company’s website.

The current research investigates this issue at a typical and popular retail store in Vietnam and

how a Named Entity Recognition (NER) software supports quicker and more efficient processes

of buying and selling goods. This research supports the utilization of text-messages (Thurlow

and Poff 2011) as means for good customer services in service sectors.

2. Literature Review

Computer-mediated communication (CMC) is the communication medium individuals use to

establish relational and social meaning, making use of technological advances. There are many

styles used for communications online, such as emails, blogs, messaging apps, online

conferencing, etc. The term CMC refers to all these online communications or digital

platforms used for human interactions (Sheblom 2020, Altohami 2020).

The question related to how CMC impacts different aspects of human life, either individual

communications or organised group communications, for pleasure, business or scholarly

activities, has been explored by researchers (December 1996). CMC has been shown to

facilitate an effective learning experience in an environment where a constructive alignment learning

philosophy can be implemented. This is with particular reference to designing learning

experiences for learners of second languages (Romiszowski and Mason 2004). CMC enables

users to enhance their interactions and exchange ideas and opinions with other people who share

similar interests and learning goals.

CMC tools can be synchronous or asynchronous. Synchronous CMC modes allow users to have

real-time interaction and discussions, for example in real-time meeting rooms or video conferencing.

This is similar to spoken interaction (Sykes 2005).

Asynchronous CMC does not happen in real time. Instead, it is based on recorded or published

information, which users access in their own time. The time lag between

receiving the CMC and contributing an answer allows the user to consider their answer

thoughtfully. Users are also able to reread or re-watch the recorded communication anytime they

wish (Altohami 2020, Newhagen 1996).

According to Berry (2004), the main advantages of asynchronous communication media over

synchronous communication media include flexibility over time and distance combined, more

active and equal team member participation, and the ability of users to reflect or collect data before

responding. The next section discusses how text messages are considered as one of the most

popular CMC modes that reflect the mentioned advantages.

2.1 Text messages

It is evident that the use of text messaging, referred to as SMS and texting, has gained huge

popularity in the last three decades (Altohami 2020). Texting has become appealing because it is

cheap, personal and unobtrusive. There are some issues related to clear understanding of

evolving text messages, which could result in misunderstandings (Thurlow and Poff 2011).

Despite these issues, text messages are effective means for social interactions such as greetings,

exchanging ideas, as well as providing help and support (Crystal 2008). Text messaging is also

used to organise real time synchronous interactions with families, friends, and colleagues. The

instant messaging (IM) technology offers users the ability to chat over the internet in real time

using text (Larson 2011).

Sirimanna and Gunawardana (2020) observed the development of technology, noting that

instant messaging (IM) apps now allow users to transmit not only texts but also images and files.

Such files are usually sent through the chat app as hyperlinks, voice, or video. Instant

messaging is now prolific in the workplace as it allows effective and quick

communication.

Cho et al. (2005) indicate that IM can support better work performance of employees based on

the quality of communication and trust, though it could interrupt work in some cases. IM,

therefore, is effective in enhancing other CMC tools. A more effective and comprehensive work

communication environment can thus be created for the employees.

It is evident that previous studies related to CMC were concerned with the effectiveness of such

media and the level of satisfaction of the user (Giri and Kumar 2010, Mukahi et al. 2003).

There seems to be a dearth of previous research examining the impact of

CMC on customer satisfaction or how Named Entity Recognition (NER) is applied. The

following sections will outline previous research efforts investigating NER.

2.2 Previous research into NER

There seem to be few published research efforts in relation to NER in constrained

environments. Jiang et al. (2010) investigated Hidden Markov Models (HMM) using an SMS database

of 1000 messages. Named entities related to events and activities were extracted from Chinese

SMSes. Such findings were specifically aimed at use with handsets.

The research carried out by Polifroni et al. (2010) adopted logistic regression to recognise named

entities, such as time, date and location, from spoken and typed messages. This research was

carried out in a controlled laboratory setting where the SMSes and spoken words of the

participants were transcribed to establish a database. The research effort was focused on

automatic speech recognition for mobile phones.

Hård af Segerstad (2002) provided a linguistic analysis of the characteristics of Swedish text messages

and the usage of SMS in Sweden. The research findings showed interesting aspects

related to the use of abbreviations borrowed from other languages and not known in the Swedish

language.

Ek et al. (2011) investigated the implementation of an NER application for text messages in

Swedish on a mobile platform. The research supports the idea that extracted entities,

including names, locations, dates, times and telephone numbers, could be used by other applications

on the phone.

In Vietnam, there is little research on NER. Pham (2020) discusses VLSP 2016 as a Vietnamese

training set established by the Association for Vietnamese Language and Speech Processing

(vlsp.org.vn). It provides two data sets: named entity recognition and sentiment analysis. Four

named-entity categories were adopted in VLSP 2016. These are similar to those considered in

the shared-task of CoNLL 2003. They are person (PER), organization (ORG), location (LOC),

and miscellaneous entities (MISC).

There are two machine-learning approaches to NER. One is the conventional machine-learning

methodology and the other is the deep-learning one. Models and methodologies that can

be applied to an NER system on VLSP 2016 include Hidden Markov Models, Support Vector

Machines, Conditional Random Fields (CRFs) and Maximum-Entropy Markov Models (MEMMs),

as well as recurrent neural networks (RNNs) with LSTM units (Li et al. 2020, Pham 2019) and PhoBERT

(Nguyen and Nguyen 2020).

Among the above approaches, BERT (Bidirectional Encoder Representations from

Transformers) is considered one of the most common pre-trained language models (Devlin et

al. 2019). As an open-source tool, like spaCy and OpenNLP, BERT has become popular. Huge

improvements for a variety of NLP tasks have been achieved through BERT which is now well

supported and prolifically used. The currently available BERT tools have been mainly applied

for English language communications. However, the architecture of such BERT tools could be

retrained or existing pre-trained multilingual BERT-based models can be employed (Devlin et al.

2019; Lample and Conneau 2019).

In this study, the authors use BERT to recognise named entities. The NER software in this research recognizes

entities including locations and names. This NER supports quicker and more efficient processes

of buying and selling goods at a typical and popular retail store in Vietnam.

3. Methodology

The study aims to establish good and effective interaction between customers and customer-

support through analysing text-messages. The two research questions are:

(1) To what extent does Named Entity Recognition (NER) software support more efficient

processes to increase sales and profit margins in the retail industry?

(2) How satisfied are the customers with the support of the NER?

In order to answer the two research questions, the study includes two main stages and a pilot

evaluation. Stage 1 relates to corpus analysis for a set of collected text messages to explore the

lexical patterns that indicate the customers’ needs. In the second stage of the investigation, the

valuable findings from the corpus analysis were used for data extraction based on a Named

Entity Recognition (NER) software including locations and names. Finally, a pilot evaluation of

customers’ satisfaction of the revised app is carried out based on the data collected via

customers’ comments on website.

3.1 Stage 1

In the first stage of the research, the authors collected message exchanges (Altohami 2020)

between customers and customer-support on the website of the company, including chat box

support, comments on the website and chat applications such as Zalo and Messenger. Messages were

short conversations, mainly customers asking whether products they wanted to buy were in

stock, asking for the addresses of the stores where they could collect the products, or providing their own

addresses for deliveries. The messages also related to customers’ complaints about products, and

supporters’ replies identifying the addresses of the buyers before transferring the issues to

the responsible departments.

These messages were written in Vietnamese, with some common ‘borrowed’ English words

such as ‘thanks’, ‘inbox’, ‘chat’. These English words were often seen among young customers.

There were about 3500 text messages collected. The data collection took into consideration any

ethical and confidentiality issues. The researchers applied the company’s policy of keeping

customers’ information confidential. The data covered aspects of customer-customer support as

it was collected from all the means of communication to customers offered by the company. We

annotated the dataset with the following categories of named entities: date, time, product, location and

address.

Text messages, posted on the stores’ websites between customers and customer-support were

collected from December 2021 to February 2022. This is the time when demand for

products increases due to the Tet holiday (the biggest holiday in Vietnam). This period was

chosen as the demand for effective customers support services increases and affects the retailing

process.

Figure 1. The two stages of the research

We divide our system into two components, as shown in Figure 1. The first component

consisted of a core of real-life text-based communication, which was collected from three sources:

mobile chat applications, chatbox support on the website, and comments on the website. This

component of the system was utilised to generate the datasets for the second stage.

Secondly, in the collected raw dataset, some issues concerning spelling mistakes, redundancy,

or punctuation need to be addressed before processing the data. In detail, the issues that need to

be sorted out include:

The first issue concerns Unicode encoding. The same Vietnamese text can be written in two Unicode

forms (precomposed and decomposed characters), which produce different UTF-8 byte sequences.

This makes the text difficult to process. For example, ‘Tiếng Việt’ can be encoded as either:

b'Ti\xe1\xba\xbfng Vi\xe1\xbb\x87t'

b'Tie\xcc\x82\xcc\x81ng Vie\xcc\xa3\xcc\x82t'

The tone mark can also be shown differently (e.g. ‘oè’ and ‘oè’), which could lead to a difference in the

meaning. The solution for this is the use of Normalization Form C (a defined normalization of

Unicode strings which makes it possible to determine whether two Unicode strings are equivalent

to each other; Wikimedia) to standardize on one type of Unicode.
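The normalization step can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch rather than the authors' implementation; the helper name `to_nfc` is hypothetical, and the two byte strings are the ones quoted above.

```python
import unicodedata

# The two byte strings quoted above: precomposed vs. decomposed "Tiếng Việt".
precomposed = b"Ti\xe1\xba\xbfng Vi\xe1\xbb\x87t".decode("utf-8")
decomposed = b"Tie\xcc\x82\xcc\x81ng Vie\xcc\xa3\xcc\x82t".decode("utf-8")


def to_nfc(text: str) -> str:
    """Return the text in Unicode Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


# The raw strings look identical on screen but compare as different.
print(precomposed == decomposed)
# After NFC both use the same code points and compare as equal.
print(to_nfc(precomposed) == to_nfc(decomposed))
```

Applying the same normalization to every message before tokenization means that a word typed on different keyboards or devices maps to a single vocabulary entry.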

The second issue relates to errors in the use of spaces. This is solved by space deletion. The third

issue concerns names written in lower case, which could convey different

meanings. For example, ‘hồ Xuân Hương’ means ‘the lake named Xuân Hương’ instead of ‘Hồ’

as a surname. The solution for this issue is to set all texts to lower case.
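The three clean-up steps described above could be combined in one small function. The sketch below is an illustration under stated assumptions: `clean_message` is a hypothetical name, and "space deletion" is assumed to mean collapsing runs of whitespace and trimming the ends, which the text does not specify.

```python
import re
import unicodedata


def clean_message(text: str) -> str:
    """Normalise one raw chat message (hypothetical helper, not the authors' code)."""
    # Step 1: standardize on one Unicode form (Normalization Form C).
    text = unicodedata.normalize("NFC", text)
    # Step 2: assumed reading of 'space deletion': collapse repeated
    # whitespace into a single space and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Step 3: set all text to lower case.
    return text.lower()


print(clean_message("  hồ   Xuân  Hương "))
```

After this step, ‘hồ Xuân Hương’ and ‘Hồ Xuân Hương’ become the same string, so the distinction between the lake and the surname has to be recovered from context by the NER model rather than from capitalisation.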

3.2 Stage 2

In the second stage of the research, the text messages corpus collected in stage 1 is used as an

input. With 3500 conversations, there are about 15000 common words in Vietnamese including

addresses, customers’ names, names of products and materials. The recognition procedure was

carried out using Wordpiece.

Wordpiece is a tokenization method which, in this study, is used to split a sentence into parts, including words.

A common issue is that some of these words cannot be found in dictionaries; these

are called UNK (unknown) words. In comparison with older methods, e.g., BPE in PhoBERT (Nguyen and

Nguyen 2020), Wordpiece helps to solve the problem by splitting words into frequently occurring

subword units (clusters, vowels and consonants) (Figure 2). This method turns

UNK words into subwords whose meanings are easier to predict.

Figure 2. Wordpiece splitting sentences into parts

For instance, ‘cty Thế Giới Di Động’ is split into ‘cty’, ‘thế’, ‘giới’, ‘di’, ‘động’. If ‘cty’ (short for

‘công ty’, company) cannot be found in dictionaries because it is written in shorthand,

Wordpiece would split it into two subwords, ‘c’ and ‘ty’. The meaning can be inferred from the

two tokens.

Another example concerns missing spaces in a customer’s text: ‘hiện gà này ở bhx chợ

phú thọ trảngdàibiênhòa đn có không ạ’. Wordpiece will split the string into the tokens ‘hiện, gà,

này, ở, bhx, chợ, phú, thọ, trảng, ##dài, ##biên, ##hòa, đ, ##n, có, không, ạ’ for further

processing.
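The splitting behaviour in this example can be sketched with the greedy longest-match-first procedure commonly used by Wordpiece tokenizers. The code below is an illustration only: the vocabulary `VOCAB` is a hypothetical toy list containing just the pieces from the example, and the function names are not taken from the study.

```python
import unicodedata

# Toy vocabulary (hypothetical); "##" marks a piece that continues a word.
VOCAB = {unicodedata.normalize("NFC", t) for t in [
    "hiện", "gà", "này", "ở", "bhx", "chợ", "phú", "thọ", "có", "không", "ạ",
    "trảng", "##dài", "##biên", "##hòa", "đ", "##n",
]}


def wordpiece(word, vocab=VOCAB, unk="[UNK]"):
    """Greedy longest-match-first split of one whitespace-delimited word."""
    word = unicodedata.normalize("NFC", word)
    pieces, start = [], 0
    while start < len(word):
        end, match = len(word), None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word is treated as unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def tokenize(sentence):
    """Split on whitespace, then apply Wordpiece to each word."""
    return [p for w in sentence.split() for p in wordpiece(w)]


print(tokenize("chợ phú thọ trảngdàibiênhòa đn"))
```

With a real vocabulary learned from the corpus, the same procedure lets a run-together string such as ‘trảngdàibiênhòa’ fall back to known pieces instead of a single unknown token.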

After splitting the words, statistics are computed to find the frequency of the 15,000 words.

This is used to build the list of words that can be merged with each other, as shown in Figure 2

above.
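One simple way to obtain such statistics is to count tokens and adjacent token pairs. The sketch below assumes that frequently co-occurring neighbours are the candidates for merging, which the text does not state explicitly; the three tokenised messages are invented for illustration.

```python
from collections import Counter

# Hypothetical tokenised messages; real input would be the tokenizer output.
tokenised = [
    ["chai", "nước", "mắm"],
    ["nước", "mắm", "chung", "cư"],
    ["chung", "cư"],
]

# Frequency of each token across all messages.
token_freq = Counter(t for sent in tokenised for t in sent)

# Frequency of each adjacent pair; frequent pairs are merge candidates.
pair_freq = Counter(p for sent in tokenised for p in zip(sent, sent[1:]))

print(token_freq.most_common(3))
print(pair_freq.most_common(3))
```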

The label engine (Lafferty et al. 2008) is the next step in this stage. In this step, the list of words

provided by the tokenizer engine is given labels, as shown in Table 1 below:


ID | Label | Description | Examples
1 | B-LOC | Begin Location | Chung [B-LOC] (the initial of ‘apartment building’: chung cư)
2 | I-LOC | Inside Location | Cư [I-LOC] (the ending of ‘apartment building’: chung cư)
3 | B-ADD | Begin Address | Alexander [B-ADD]
4 | I-ADD | Inside Address | de [I-ADD], Rhodes [I-ADD]
5 | B-PROD | Begin Product | Nước [B-PROD] (the initial of ‘fish sauce’: nước mắm)
6 | I-PROD | Inside Product | Mắm [I-PROD] (the ending of ‘fish sauce’: nước mắm)
0 | O | Words which do not belong to the above groups | Chai [O] (bottle)

Table 1. Label classification

There are 3 groups of label classification: Location, Address, Product. Each group includes 2

subgroups: B- (begin) and I- (inside) (Pham 2018). The last group, 0, includes words which do not

belong to the first six groups. This follows a common format for tagging tokens in

computational linguistics presented by Ramshaw and Marcus (1995). The B- prefix before a tag

indicates that the token is the beginning of a block. The I- prefix before a tag indicates that the token is inside a block.

An O tag indicates that a token belongs to no block.

The third step is mapping tokens to labels. For example, “Một chai nước mắm đến

chung cư đường Alexander de Rhodes” is encoded as shown below (Table 2).

Token | Một | chai | nước | mắm | đến | Chung | Cư | đường | Alexander | De | Rhodes
Token ID | 11 | 13 | 15 | 16 | 8 | 56 | 34 | 17 | 88 | 98 | 332
Label | O | O | B-PRO | I-PRO | O | B-LOC | I-LOC | O | B-ADD | I-ADD | I-ADD
Label ID | 0 | 0 | 5 | 6 | 0 | 1 | 2 | 0 | 3 | 4 | 4

Table 2. Mapping tokens to labels

Each pair of a token ID and a label ID is called an offset. A number of offsets

form an encoded sentence. For example, the offsets from Table 2, [(11,0),(13,0),

(15,5),(16,6),(8,0),(56,1),(34,2),(17,0),(88,3),(98,4),(332,4)], make up the sentence “Một chai nước

mắm đến chung cư đường Alexander de Rhodes”. A dataset contains such encoded sentences. In

this study, 12250 sentences from 3500 conversations were processed in this way.
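The encoding of the example sentence can be reproduced with two lookup tables. This is a minimal sketch: the token IDs are copied from Table 2 and the label IDs from Table 1, while the dictionary and function names are hypothetical and a real vocabulary would be far larger.

```python
# Label IDs as listed in Table 1 (Table 2 abbreviates PROD as PRO).
LABEL_IDS = {"O": 0, "B-LOC": 1, "I-LOC": 2, "B-ADD": 3, "I-ADD": 4,
             "B-PRO": 5, "I-PRO": 6}

# Token IDs copied from Table 2 for the example sentence only.
TOKEN_IDS = {"một": 11, "chai": 13, "nước": 15, "mắm": 16, "đến": 8,
             "chung": 56, "cư": 34, "đường": 17, "alexander": 88,
             "de": 98, "rhodes": 332}


def encode(tokens, labels):
    """Pair each token ID with its label ID (one 'offset' per token)."""
    return [(TOKEN_IDS[t.lower()], LABEL_IDS[l]) for t, l in zip(tokens, labels)]


tokens = "Một chai nước mắm đến chung cư đường Alexander de Rhodes".split()
labels = ["O", "O", "B-PRO", "I-PRO", "O", "B-LOC", "I-LOC", "O",
          "B-ADD", "I-ADD", "I-ADD"]
encoded = encode(tokens, labels)
print(encoded)
```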

In the final step, the dataset is input to the NER engine. Tokens are recognized and classified

according to labels such as B-LOC, I-LOC, B-ADD, I-ADD, B-PRO, I-PRO and O. For

example, for the encoded input of “Hồ Xuân Hương”, “Hồ” will be classified as B-ADD, “Xuân” as

I-ADD and “Hương” as I-ADD. Then, the tokens are merged to form the address entity

“Hồ Xuân Hương”. Similar processes are applied to identify the entities of location and

product.
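The merging step can be sketched as a single pass over the predicted labels, joining each B- token with the I- tokens that follow it. The function name `merge_entities` is hypothetical and the labels are supplied by hand here; in the study they would come from the NER engine.

```python
def merge_entities(tokens, labels):
    """Group B-/I- tagged tokens into (entity_type, text) pairs."""
    entities, current_type, current_tokens = [], None, []
    for token, label in zip(tokens, labels):
        prefix, _, entity_type = label.partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # A new entity starts: close the previous one, if any.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = entity_type, [token]
        elif prefix == "I":
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # An "O" token closes any open entity.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


print(merge_entities(["Hồ", "Xuân", "Hương"], ["B-ADD", "I-ADD", "I-ADD"]))
```

Treating an I- tag that follows a different entity type as the start of a new entity is a design choice made for this sketch, so that a mislabelled token does not silently extend the wrong entity.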

3.3 A pilot evaluation on customer satisfaction

A pilot experiment was carried out for a preliminary evaluation of customer satisfaction. The

proposed chat facility within the NER Application was used for a pilot investigation for a

duration of two weeks. Around 600 conversations in the chatbox were collected during this time.

The messages were mainly related to customers asking for the products they wanted to buy and

the nearest stores to where the customers were. These messages were reviewed to establish the

overall satisfaction of the customers who chatted with customer-support.

In order to obtain customers’ feedback on using the revised Application, an online feedback

form was designed and attached to the website so that the customers could fill it in after the

conversation exchanges with customer-support. The form contains 5 main questions which

attempt to quantify the customers’ self-assessment of their use of the revised chatting app.

These questions were designed to establish how satisfied the customers were with the

customer support service. 387 out of 600 customers completed the online feedback form.

Participants answered the questions on a Likert scale that describes their choice from 1 (very

bad) to 5 (very good). One open-ended question (question 6) was used for triangulation of the

answers and for further comments.

These questions in the feedback form are as follows:

1. How much do you like the chatting app?

2. How fast do you have the answers for the products you want to buy?

3. How fast do you receive the answers for the nearest stores?

4. How much are the suggestions on the website appropriate to your needs?

5. What scale do you choose to rate the level of service satisfaction? (from 1 very bad – 5

very good)

6. Which Application version do you prefer? The old version or the new version?

4. Results and discussion

In this section the data collected from corpus analysis and NER software (Larson 2011, Pham

2018) will be explored to find out how NER supports good interaction between customers and

customer-support. In the first part, locations refer to customers’ addresses, as provided by them,

to establish the nearest store to them which can supply the ordered products. The locations

database, therefore, also refers to the stores’ addresses, thus linking the customers to the nearest

locations of stores. Names database refers to the products and the names of customers. In the

second part of the section, the customers’ satisfaction with the support of the NER will be

discussed.

4.1 NER software supports more efficient processes to increase sales and profit margins in the

retail industry

4.1.1 The common entities reflecting customer’s needs

The entities of ‘products’ and ‘locations’ appear with high frequency in the text messages, 96%

and 89% respectively. Table 3 indicates that the customers who make contact to buy products often

enquire about where they can find the products they want. Customer support, in return, asks the

customers for their addresses to serve and meet their needs.

Entity | Details | Frequency | Label | Count (of 3500 in total)
Sample date | Friday | 79% | DATE | 2765
Sample time | 5 PM - 10 PM | | TIME |
Sample telephone no | +092.xxxxx | | PHONE |
Product | Bò Úc, rau, sản phẩm 4kFarm, sữa Vinamilk (Australian beef, vegetable, 4kFarm, Vinamilk) | 96% | PRODUCT | 3360
Location | Gần trường học, Vinhome, Nhà văn hoá thanh niên (near schools, Vinhome, behind Youth Cultural House) | 89% | LOCATION | 3115
Address | Phường, Tỉnh, quận, huyện, thành phố (Ward, District, City) | 83% | ADDRESS | 2905
First name & Titles | Cô Lê, Chú Ba, Anh Bảy, chị Hai; chủ nhà, chủ quán kế bên nhà (house owner, the shop owner who lives next door) | 72% | NAME | 2520

Table 3. The frequency of common entities

The sample data indicated that 89% of customers provided their location while 83% provided

their address as well. Customers provide their addresses to be supported regarding the stores

where they can find the products they need. This shows that these entities appear in almost all

customers to customer-support conversations. Interestingly, these entities appear together with

high frequency.
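The percentages in Table 3 follow directly from the entity counts and the total of 3500 messages. The short check below uses only the figures reported in the table; the variable names are illustrative.

```python
TOTAL_MESSAGES = 3500

# Number of messages containing each entity type, as reported in Table 3.
counts = {"DATE": 2765, "PRODUCT": 3360, "LOCATION": 3115,
          "ADDRESS": 2905, "NAME": 2520}

# Share of messages (in percent) in which each entity type appears.
shares = {label: round(100 * n / TOTAL_MESSAGES) for label, n in counts.items()}
print(shares)
```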

4.1.2 Date and time related to customers’ purchases and needs

Table 4 shows that customers often enquired about ordering food around 10 a.m.-1 p.m. and 6

p.m.-11 p.m. on weekdays, and from 9 a.m. to 11 p.m. at weekends and on holidays. The required products,

mainly fish, meat, eggs, milk, vegetables and fruits, were quite varied on weekdays.

Date | Time | Product
Monday | 10AM - 1PM, 6PM - 10PM | Thịt, cá, trứng, sữa, rau tươi, trái cây (Meat, fish, eggs, milk, fresh vegetables, fruits)
Tuesday | 10AM - 1PM, 6PM - 10PM | Thịt, cá, trứng, sữa, rau tươi, trái cây (Meat, fish, eggs, milk, fresh vegetables, fruits)
Wednesday | 11AM - 1PM, 6PM - 10PM | Thịt, cá, trứng, sữa, rau tươi, trái cây (Meat, fish, eggs, milk, fresh vegetables, fruits)
Thursday | 11AM - 1PM, 6PM - 11PM | Thịt, cá, trứng, sữa, rau tươi, trái cây (Meat, fish, eggs, milk, fresh vegetables, fruits)
Friday | 6PM - 12PM | Hải sản, thịt, rau củ (Seafood, meat, vegetables)
Weekend | 9AM - 11PM | Bánh kẹo, nước ngọt, trái cây, đồ dùng một lần (Sweets, soft drinks, fruits, disposable products)
Holiday | 9AM - 11PM | Bánh kẹo, nước ngọt, trái cây, đồ dùng một lần (Sweets, soft drinks, fruits, disposable products); Thịt & hải sản (Meat & seafood)

Table 4. Types of products according to date and time

Evidently, the customers’ enquiries vary according to date and time. For example, at

weekends, entertainment-related food and drink such as beer, soft drinks, cakes and seafood, as well as

disposable products, are ordered more frequently, particularly from the suburbs of Ho Chi

Minh City (HCMC).

4.1.3 Products – Location relationship related to customers’ needs

The analysis of the collection of written texts (Altohami 2020) with Wordpiece showed that

common products such as meat, fish, eggs, vegetables, fruits, milk, cheese, yogurt and cereal are

similarly required by customers from residential areas, office buildings and industrial parks

(Table 5). This demonstrates the high demand for these products.

Residential areas:
● High population areas (e.g. Bình Thạnh, Phú Nhuận, Gò Vấp, Tân Bình, Tân Phú districts)
● Apartment buildings (e.g. Vinhome Center Park, Vinhome Grand Park)
● Dormitories (e.g. University Dormitory in Thu Duc)

Office buildings & Industrial Parks:
● District 1, 3
● High Tech Park
● Quang Trung Software City, district
● Tan Tao Industrial Park, Tân Binh District
● Vinh Loc Industrial Park, Binh Chanh district

Item % Item %

Đồ vệ sinh cá nhân (Toiletries)
Nước rửa tay (Hand sanitizer)
Khẩu trang (Face mask)
Đồ dùng 1 lần (Disposable items)
Nước rửa tay và khăn giấy (Hand sanitizer and tissue)

Đồ khô (Dry food)
- Gạo và ngũ cốc (Rice & cereal)
- Đồ chay (Vegan food)
- Gia vị (Seasoning)
- Bánh kẹo (Sweets)

Thức ăn nấu sẵn (Ready-to-eat food)
Snack & Sandwich
Bánh kẹo (Sweets)
Mì ly, xúc xích (Instant noodles and sausage)

Đồ uống (Beverages)
Rượu, bia, nước ngọt (Wine, beer, soft drinks)
Nước khoáng (Mineral water)
Nước yến (Bird's nest drink)
Trà và cà phê (Tea & coffee)
Nước uống sô cô la (Chocolate drink)

Đồ uống (Beverages)
Rượu, bia, nước ngọt (Wine, beer, soft drinks)
Nước khoáng (Mineral water)
Trà và cà phê (Tea & coffee)

Thịt, cá, trứng, rau, trái cây (Meat, fish, eggs, vegetables, fruits)
Trái cây (Fruits) 65
Sữa, phomai, yaourt, ngũ cốc (Milk, cheese, yogurt, cereal)
Sữa, phomai, yaourt, ngũ cốc (Milk, cheese, yogurt, cereal)
Đồ đông lạnh (Frozen food) 38
Đồ đóng hộp (Canned food) 23

Table 5. Products – location and location-products

However, customers who live in areas where there are more apartment buildings (e.g. Vinhome

Centre Park in District 2, Vinhome Grand Park in District 9, and Tan Binh and Tan Phu, where the

population is high) require more toiletries. Ready-to-eat food and disposable products are required

more in areas where there are more office buildings and industrial parks. It is worth noting that

demand for all products is higher from suburbs such as Thu Duc than from the

centre of the city (Districts 1 and 3). This reflects the recent trend of expansion of the residential areas

from the center to the suburbs of HCMC.

Based on such information related to the required products and locations, which was established

with the use of NER (Li et al. 2020), effective suggestions to match customers to products and/or

location could be provided. This is crucial to provide good customer service responding to

customer needs.

4.1.4 The use of lexical features for good interaction between customers and customer-support

In this section a number of lexical features of the communication data collected for this research,

such as, titles prefixing people’s names, will be explored (Table 6).

Title prefixes

In Vietnamese, it is important to show politeness by using titles. ‘Anh’ for male customers

and ‘chị’ or ‘em’ for female customers are most commonly used, as they suit the majority of

customers. When the customers are known to be older people, ‘cô’, ‘chú’ or ‘bác’ is

used.

Shorthand techniques

Common shorthand forms used by customers are: Tks = thanks, ac = anh chị, e =

em, k/ko = không (no/not), dc = được (can), hsd = hạn sử dụng (expiry), while NSX (Ngày sản xuất,

manufactured date), HSD (Hạn sử dụng, expiry date) and SL (Số lượng, số lô; quantity, lot number)

are often used by customer support.

Common Emojis and Emoticons. Customers usually expressed their positive or negative

feelings by using emojis and emoticons, mainly for happy (:-)) or sad (:-( ☹).

Greetings and Questions. Some questions that are often seen in the collected text messages are:

“Ac có đang inbox ko?”, “Allo ac” (Are you inbox?, Allo?). In many cases, these questions are

considered as greetings.

Lexical features | Customers | Customer support
Greetings | Chào ac | Chào anh/chị
Title prefixes | Male: Anh; Female: Chị; Both: Em, tôi | Male: Anh, chú; Female: Chị, cô; Both: Em, quý khách, bác
Shorthand techniques | Tks = thanks; ac = anh chị; e = em; k/ko = không; dc = được; hsd = ha … | Cảm ơn (Thanks); NSX (Ngày sản xuất: Manufactured date); HSD (Hạn sử dụng: Expiry); SL (Số lượng, số lô: Quantity, Lot number); cty (Công Ty: Company), tnhh (Ltd)
Hedges | Vâng (Yeah), ạ, dạ, ơi | Vâng (Yeah), ạ, dạ
Emoticons | Happy: :-) or ‘hehe’; Unhappy: sad :(, confused =/, cool B-) | Happy: :-) or ‘hehe’; Sorry feelings when the needs of customers are not met yet: :(( :-((

Table 6. Lexical features

With the support of NER, replies such as ‘Dạ có ở cửa hàng Nguyen Dinh Chieu, gần

chổ chị’ (‘Yes, we sell them in Nguyen Dinh Chieu near your place’) were more frequent than replies such as ‘Xin lỗi, hiện chúng tôi

hết hàng’ (‘Sorry, we are out of it’) or ‘Anh/chị chờ một chút ạ’ (‘Please can you wait for a while’)

from customer support. The more often the replies were ‘Có’ (‘Yes’) rather than ‘Không có’ (‘No’), the more

happy emoticons were shown in customers’ messages.

In summary, customers made two common types of enquiry: (1) whether the products they

need are in stock (products-location) and (2) providing their addresses to request the nearest

stores’ locations where they can buy the products (location- products).

As for products-location, the results show that, as expected, customers require products close to

their locations. Generally, all customers’ enquiries were related not just to the products but, as

expected, to the nearest store to them. The findings reveal that with the support of the NER engine

(Nguyen and Nguyen 2020), when the needs of customers increased, the retail system of the

company was able to meet such needs effectively, as is evident from the higher frequency of ‘Có’ (‘Yes’) replies

than ‘Không có’ (‘No’) replies and the high frequency of happy emoticons (e.g. :-) ) in

the customers’ messages.

As for location-products, if the customer’s desired store was out of stock of the product, the proposed NER system could suggest appropriate alternative stores where the product was in stock. This resulted in a significant reduction in customer complaints about the clarity of product availability across locations. NER also supported the process of looking up customers’ addresses and passing them to the delivery staff, so that products could be delivered to the customers quickly.
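The fallback behavior described above can be sketched as follows, assuming the product and location have already been extracted from the customer’s message by the NER step. The `Store` class, the `suggest_stores` function, and the inventory data are hypothetical and stand in for the company’s actual database.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    name: str
    district: str
    stock: dict = field(default_factory=dict)  # product -> units on hand


def suggest_stores(product, district, stores):
    """Return (stores in the requested district, alternative stores).

    Only stores that have the product in stock are returned. Alternatives
    are listed only when no store in the requested district has stock.
    """
    in_stock = [s for s in stores if s.stock.get(product, 0) > 0]
    local = [s for s in in_stock if s.district == district]
    if local:
        return local, []
    return [], in_stock


# Hypothetical inventory; real data would come from the store database.
stores = [
    Store("Store A", "District 1", {"milk": 0}),
    Store("Store B", "District 3", {"milk": 12}),
    Store("Store C", "Thu Duc", {"milk": 4}),
]
local, alternatives = suggest_stores("milk", "District 1", stores)
print([s.name for s in alternatives])  # ['Store B', 'Store C']
```

A production system would presumably rank the alternatives by distance to the customer’s address rather than return them in database order; that ranking is omitted here because the text does not describe it.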

In addition, the research data showed evidence of the current expansion of residential areas from the center to the suburbs of Ho Chi Minh City (HCMC). Hence, the proposed NER system can help arrange products and delivery to meet customers’ needs in the wider suburbs of HCMC.

4.2 The customers’ satisfaction with the support of the NER

In this section, the results from a pilot evaluation of customers’ satisfaction are discussed. As shown in Table 7, the majority of customers found the proposed app helpful in buying the products they want.

The percentage of customers who chose “5” or “4” (A great deal/Good) for the question on “how fast they have the answers” was 60%. Similarly, 80% of customers chose “5” or “4” for the question on “how much the suggestions on the web are appropriate to your needs”. For the question “How much do you like the chatting app?”, the majority of the customers, 86% in total, responded with either 5 or 4 (A great deal/Good).
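As a small worked example of how such a combined figure is obtained, the 86% reported for the chatting-app question is the sum of the shares for ratings 4 and 5 in Table 7. The dictionary and function names below are illustrative only.

```python
# Shares (in percent) of ratings 4 and 5 for the question
# "How much do you like the chatting app?", as reported in Table 7.
liking = {4: 39.5, 5: 46.5}


def top_two_share(shares):
    """Combined share of respondents who chose rating 4 or 5."""
    return shares[4] + shares[5]


print(top_two_share(liking))  # 86.0
```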

As for the last question, on the customers’ preference between the old and new versions of the app, 90% of the customers said they preferred the revised app. The reasons they gave were as follows:

“I prefer the new version because I received quick answers. It makes me feel I am cared for.”

“I prefer the new version because it is very friendly and natural. I don’t feel like I am talking

with a machine.”

“I often receive the right products to buy and can find the right shops easier.”

Question | 1 (A little/Bad) | 2 | 3 (Average) | 4 | 5 (A great deal/Good)
How much do you like the chatting app? | (1%) | (12.9%) | 100 (25.8%) | 153 (39.5%) | 180 (46.5%)
How fast do you have the answers for the products you want to buy? | (9%) | (11.6%) | (12.9%) | 100 (25.8%) | 157 (40.6%)
How fast do you receive the answers for the nearest shops? | (11.6%) | (12.9%) | (18.1%) | 172 (44.4%)
How much are the suggestions on the website appropriate to your need? | (2.6%) | (8.2%) | (22.7%) | 120 (31%) | 130 (33.5%)
What scale do you choose to rate the level of service satisfaction? | (1%) | (14.5%) | (16.3%) | (25.3%) | 166 (42.9%)

Table 7. The customers’ feedback on using the revised app

5. Conclusion and Recommendation

The research showed that customers responded strongly and positively to the proposed app. This suggests that an NER system can support more effective interaction between customers and customer support. Therefore, the design and implementation of such apps is recommended. This leads to the following conclusion.

The Named Entity Recognition (NER) system proposed by the authors in this paper was found to increase the efficiency of the retail process in an age of increasingly popular and widespread e-commerce. The effectiveness was evident in both of the investigated aspects of the process, namely location-products and products-location. The NER system provided a platform for a quick and effective customer interface to find the products of interest as well as the nearest store where the products are in stock. The user interface of the NER system first offers the location-products option, whose output states whether the product is in stock at the customer’s location. When the product is not available at the customer’s location, the NER data engine switches to the second option, products-location, which queries the database to list all stores where the product is in stock.

The authors are now seeking collaborations with e-commerce retail sectors in various geographical locations to expand this research, with a view to refining a targeted database linking customers to their preferences and needs. This further research will employ tools from Artificial Intelligence (AI).

Limitation

The paper is limited in several ways. Firstly, the data was collected mainly through the company website. Although the website is a major channel for communicating with the company’s customers, the authors would like to explore the research issues in other means of communication, e.g. email. However, this kind of information could not be accessed because it consists of internal company documents.

Secondly, the data is limited to one company. To improve the validity and generalizability of the findings, data from other companies should be incorporated in the future development of the research.

This research is partly funded by University of Economics Ho Chi Minh City (UEH), Vietnam.

References

Altohami, M. A. Waheed (2020). Text messages: A computer-mediated discourse analysis. International Journal of Advanced Computer Science and Applications, Vol. 11, No. 7, pp. 79-87.

Berry, G. R. (2004). Lessons from the Online Teaching Experience. Journal of the Academy of

Business Education, 5, 88-97.

Cho, H.-K., Trier, M., and Kim, E. (2005). The use of instant messaging in working relationship development: A case study. Journal of Computer-Mediated Communication, Vol. 10, No. 4. http://jcmc.indiana.edu/vol10/issue4/cho.html

Customer satisfaction. (2023, Jan 05). Wikipedia. https://en.wikipedia.org/wiki/Customer_satisfaction

Crystal, D. (2008). Txtng: The gr8 db8. New York: Oxford University Press.

December, J. (1996). Units of analysis for Internet communication. Journal of Computer-Mediated Communication, Vol. 1, No. 4.

Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL, pages 4171-4186.

Kingma, D. P., & Ba, J. (2014). Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.

Ek, T., Kirkegaard, C., Jonsson, H., & Nugues, P. (2011). Named Entity Recognition for Short Text Messages. Procedia - Social and Behavioral Sciences, Vol. 27, pp. 178-187.

Giri, V. N., & Kumar, P. (2010). Assessing the Impact of Organizational Communication on Job

Satisfaction and Job Performance. National Academy of Psychology (NAOP) India.

Hård af Segerstad, Y. (2002). Use and adaptation of written language to the conditions of computer-mediated communication. Doctoral thesis, Göteborg University.

Jiang H., Wang X., & Tian J. (2010). Second-order HMM for event extraction from short

message. In Proceedings of NLDB, pages 149–156.

Lample, G., & Conneau, A. (2019). Crosslingual Language Model Pretraining. In Proceedings of

NeurIPS, pages 7059–7069.

Larson, G. W. (2011). Instant Messaging. Encyclopedia Britannica.

Lafferty, J., McCallum, A., & Pereira, F. (2008). Conditional Random Fields: Probabilistic

Models for Segmenting and Labeling Sequence Data. Semantic Scholar. 282-289.

Li, P. H., Fu, T. J., & Ma, W. Y. (2020). Why Attention? Analyze BiLSTM Deficiency and Its

Remedies in the Case of NER. In AAAI '20. 8236--8244.

Mukahi, T., Nakamura, M., & Not, R. D. (2003). An Empirical Study on Impacts of Computer-Mediated Communication Management on Job Satisfaction. Adelaide, South Australia: 7th Pacific Asia Conference on Information Systems.

Named-entity recognition. (2023, Jan 05). Wikipedia. https://en.wikipedia.org/wiki/Named-entity_recognition

Newhagen, J. E., & Rafaeli, S. (1996). Why communication researchers should study the internet: A dialogue. Journal of Communication, Vol. 46, No. 1, pp. 4-13.

Nguyen, D. Q., & Nguyen, A. T. (2020). PhoBERT: Pre-trained language models for Vietnamese. In Findings of the Association for Computational Linguistics: EMNLP 2020.

Pham, P. Q. M. (2018). A feature-rich vietnamese named-entity recognition model. arXiv

preprint arXiv:1803.04375.

Polifroni J., Kiss I., & Adler M. (2010). Bootstrapping named entity extraction for the creation of

mobile services. In Proceedings of LREC.

Ramshaw, L. A. and Marcus, M. P. (1995). Text Chunking using Transformation-Based Learning. Computation and Language, Vol. 11, pp. 82-94.

Romiszowski, A. and Mason, R. (2004). Computer-mediated communication. In Handbook of

research on educational communications and technology, D. H. Jonassen, Ed. Mahwah,

NJ: Lawrence Erlbaum Associates, pp. 397-431.

Sirimanna, U. I., and Gunawardana, T. S. L. W. (2020). Impact of Computer Mediated Communication Systems on Job Satisfaction: Employees in the Transmission Division of Ceylon Electricity Board, Sri Lanka. The 9th International Conference on Management and Economics.

Sykes, J. M. (2005). Synchronous CMC and pragmatic development: Effects of oral and written

chat. CALICO Journal, Vol. 22, No. 3, pp. 399-431.

Thurlow, C. and Poff, M. (2011). Text messaging. In Handbook of the pragmatics of CMC, S. C. Herring, D. Stein, and T. Virtanen, Eds. Berlin and New York: Mouton de Gruyter.

Additional Readings

Chiu, J., and Nichols, E. (2016). Named entity recognition with bidirectional LSTM-CNNs.

Transactions of the Association for Computational Linguistics.

Cui Y., Che, W, Liu, T., Qin, B., Yang, Z, Wang S., and Hu, G. (2019). Pre-Training with Whole

Word Masking for Chinese BERT. arXiv preprint, arXiv:1906.08101.

De Vries, W., Van Cranenburgh A., Bisazza A., Caselli T., Van Noord, G., and Nissim, M.

(2019). BERTje: A Dutch BERT Model. arXiv preprint, arXiv:1912.09582.

Huang, Z.; Xu, W.; and Yu, K. (2015). Bidirectional LSTM-CRF models for sequence tagging.

arXiv preprint arXiv:1508.01991.

Lin, B. Y.; Xu, F.; Luo, Z.; and Zhu, K. (2017). Multi-channel BiLSTM-CRF model for

emerging named entity recognition in social media. In Proceedings of the 3rd Workshop on

Noisy User-generated Text.

Ma, X., and Hovy, E. (2016). End-to-end sequence labeling via bi-directional LSTM-CNNs-

CRF. In Proceedings of the 54th Annual Meeting of the Association for Computational

Linguistics (Volume 1).

Nguyen H., Ngo H., Vu L., Chan V., and Nguyen H. (2019). VLSP Shared Task: Named Entity

Recognition. Journal of Computer Science and Cybernetics, 34(4):283–294.

Nguyen, K. A., Dong, N., and Nguyen Cam-Tu. (2019). Attentive Neural Network for Named

Entity Recognition in Vietnamese. In Proceedings of RIVF.

Ratinov, L., and Roth, D. (2009). Design challenges and misconceptions in named entity recognition. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009).

Wu, S. and Dredze, M. (2019). Beto, bentz, becas: The surprising cross-lingual effectiveness of

BERT. In Proceedings of EMNLP-IJCNLP, pages 833–844.

Key Terms and Explanations

Named-entity recognition: a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, time expressions, quantities, monetary values, and percentages.

Customer satisfaction: a term frequently used in marketing. It is a measure of how products

and services supplied by a company meet or surpass

customer expectation.

Text message: real-time text transmission over the Internet.

Dr Le Thi-Hong Vo

University of Economics Ho Chi Minh City (UEH)

Vietnam

[email protected]

Dr. Le Thi-Hong Vo holds a PhD in TEFL/TESOL from the University of Portsmouth, U.K. With

over 12 years She is particularly interested in materials design, using technology in English

teaching and how English language training can be improved with IT support, classroom

research, teacher education, global Englishes and English language communicative competence

and intercultural communication required of graduates at the workplace.

Thien Hang Tuan

Mobile World Investment Corporation

[email protected]

Thien Hang Tuan is currently a developer at Mobile World Investment Corporation and a

lecturer at Mindx Technology School. He received a Bachelor of Engineering in software

engineering from the University of Information Technology, Vietnam National University. He

worked as a researcher at Soongsil University, Korea in 2019. His work

focuses on Automated communication systems and Fault prediction systems.

Dr Ayman Yossef Nassif

University of Portsmouth, U.K.

[email protected]

Dr Ayman Nassif is a highly motivated educator, researcher and consultant engineer.

His teaching and research experience spans solid mechanics, materials, structural

engineering at all undergraduate and postgraduate levels. His teaching expertise

includes fire structural engineering, thermo-mechanical FE modelling, concrete

materials and technology. His industrial experience includes structural design and

supervision of construction of buildings, roads, bridges, telecommunication towers and

irrigation systems. His experience was gained in Egypt, Finland, UK, Switzerland and

Vietnam.
