Computational Intelligence - August 2017 - 26

Learning from out-of-domain prior knowledge achieves better
results than learning from a single domain only. This suggests
that the pre-learnt weights from a similar dataset are very
valuable for new domains, which is particularly interesting for
domains for which little training data is available. We can also
see from Table 2 that SfxH achieves very decent performance
without any in-domain data at all, based purely on training from
SfxR. This is a remarkable result because it means that we can
generate inputs for a new domain based on no annotated training
data at all. These results clearly show the significance of using
a common input representation across domains. Abstracting away
from particular slots, such as "Kirin restaurant" or "Pacific
Heights area", we can reuse the lexical-syntactic patterns learnt
in one domain in others. Table 3 shows examples of the abstract
patterns and realizations that were transferred across domains.
Comparing with related work, Yu et al. [72] reported a BLEU-1
score of 0.59 and a BLEU-2 score of 0.39 for RefCoco. This is
substantially lower than our scores and might reflect the
increased difficulty of the scenario in Yu et al. [72], who
generated referring expressions directly from images. In terms of
navigation, most related work on the Sail data [24]-[26] focuses
on generating action sequences rather than the actual route
instructions. For spoken dialogue, Wen et al. [34] achieve BLEU-4
scores of 0.73 and 0.83 for SfxR and SfxH, respectively, with a
semantic error rate of 0.046. While these results are slightly
better than ours for in-domain training, our models are able to
transfer weights and train a reasonable policy for SfxH from SfxR
alone.
The most relevant comparison to our model in terms of multi-task
learning is Wen et al. [12]. The authors achieved a maximum BLEU
score of 0.48 for domain-transferred data, only 52% of our 0.92,
but they worked on the TV and laptop domains, which does not
allow a direct comparison between results. The authors reported a
semantic error of 0.04. In contrast to our work, which transfers
learnt models across domains and natural data, Wen et al.'s
experiments are based on artificially generated data. Such data
often do not display the same variety and complexity as naturally
occurring data, which arguably shows that our AMR-based inputs
allow for more complex lexical-syntactic constructions to be
learnt.

TABLE 3 Examples of lexical-syntactic patterns learnt in one
domain then used in another.
IMPERATIVE CLAUSE CONSTRUCTION (WITH RELATIVE CLAUSE)
(e1 / event :arg0 (y / you) :arg1 (b1 / obj :mod
property :location (w / obj :op1 (o1 / on [in]) ))
:mode imperative)
Give: "CLICK THE RED BUTTON (THAT IS) ON THE WALL."
SfxR: "TRY CHINESE RESTAURANT KIRIN IN THE PACIFIC HEIGHTS
AREA."
TRANSITIVE CLAUSE CONSTRUCTION
(e1 / event :arg0 (b1 / obj :mod property) :arg1
(b2 / obj :mod property))
Gre: "THE YELLOW SPHERE (THAT IS) TOUCHING THE BLUE BOX."
SfxR: "SOURCE RESTAURANT SERVES ITALIAN FOOD."
COMPLEX NOUN PHRASE AND SPATIAL RELATION AND TEMPORAL ADVERB
(e1 / obj :time (n / now) :domain (b1 / obj :mod
property :mod property :location (l / obj :op1 (n /
on [near, by]))) )
Gre: "NOW, THE BLUE CIRCLE ON THE GREEN SQUARE."
Gre: "NOW, THE GREEN BUTTON BY THE WINDOW."
SfxR: "NOW, AN INDIAN RESTAURANT NEAR PACIFIC HEIGHTS."
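
The slot abstraction behind the patterns in Table 3 can be illustrated with a small sketch. It is not the authors' implementation: the article abstracts at the level of AMR-based inputs, whereas this example works on surface strings only, and the slot lexicon, the placeholder syntax and the function names `delexicalise` and `relexicalise` are all assumptions made for illustration.

```python
import re

# Hypothetical slot lexicon; the article does not list its slot inventory.
SLOT_VALUES = {
    "NAME": ["Kirin"],
    "AREA": ["Pacific Heights"],
    "FOOD": ["Chinese", "Italian", "Indian"],
}


def delexicalise(utterance, slot_values=SLOT_VALUES):
    """Replace known slot values with placeholders such as <NAME>.

    Returns the abstract template and the slot bindings that were found.
    """
    bindings = {}
    for slot, values in slot_values.items():
        for value in values:
            pattern = re.compile(r"\b" + re.escape(value) + r"\b", re.IGNORECASE)
            if pattern.search(utterance):
                utterance = pattern.sub("<" + slot + ">", utterance)
                bindings[slot] = value
    return utterance, bindings


def relexicalise(template, bindings):
    """Fill a template with slot values, possibly taken from another domain."""
    for slot, value in bindings.items():
        template = template.replace("<" + slot + ">", value)
    return template


template, bindings = delexicalise(
    "Try Chinese restaurant Kirin in the Pacific Heights area."
)
print(template)
print(relexicalise(template, {"FOOD": "Italian", "NAME": "Source", "AREA": "Mission"}))
```

Once the slot values are removed, the remaining template carries only the lexical-syntactic pattern, which is what makes it reusable with values from a different domain. The replacement values in the last line are made up for the example.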


C. Results: Subjective Evaluation

To also assess the subjective quality of the generated outputs,
we recruited 204 human judges from the CrowdFlower² and AMT³
crowdsourcing platforms to rate them. All judges were
self-declared native or fluent speakers of English and together
rated 3425 utterances sampled randomly from a pool of 120
candidates per model. To allow for a comparison with related
work, we follow previous authors in asking judges to rate the
naturalness of utterances. They were asked to agree with the
statement "The utterance is natural (i.e. could have been
produced by a human)." on a scale of 1-5, where 1 is the worst
score and 5 is the best. For each dataset, we also collected an
equal number of ratings for the original human utterances to
provide an upper bound for the comparison of our systems. The
results are shown in the rightmost column in Table 2. Medians are
shown alongside averages in parentheses. In the statistical
analysis, we focus on the difference between out-of-domain
training and training with prior knowledge to see what can be
gained from having prior weights for training. Symbol * indicates
statistical significance at p < 0.05 according to a 2-tailed
Wilcoxon signed rank test. From the analysis we can see that two
of the six comparisons are statistically significant, namely Give
with Gre prior and Sail with Give prior, both in the navigation
domain. None of the other differences are significant. We believe
that these results are encouraging in that we did not expect all
differences to be statistically significant. For example,
although the absence of a significant difference between
in-domain training, out-of-domain training, and training with
prior knowledge does not of course indicate equivalent policies
or performance, it does at least mean that the transfer of
training data from one domain to another does not lead to a
significant deterioration of generated outputs.
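
To make the significance test concrete, the following is a minimal sketch of a 2-tailed Wilcoxon signed rank test on paired 1-5 ratings. It is not the authors' analysis code. The function name `wilcoxon_signed_rank`, the example ratings and the use of the normal approximation with a tie correction are assumptions; for small samples an exact test or a statistics package would be preferable.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-tailed Wilcoxon signed rank test (normal approximation).

    x and y are paired samples. Zero differences are discarded and tied
    absolute differences receive the average of the ranks they span.
    Returns (W+, W-, p-value).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed
    return w_plus, w_minus, p_value


# Hypothetical ratings for the same utterances under two training regimes.
with_prior = [4, 5, 3, 4, 5, 4, 5, 4]
out_of_domain = [3, 3, 3, 2, 4, 3, 3, 3]
w_plus, w_minus, p = wilcoxon_signed_rank(with_prior, out_of_domain)
print(w_plus, w_minus, p < 0.05)
```

The test is paired, so each rating in one list must correspond to the same item in the other. It suits ordinal ratings such as the 1-5 scale here because it relies on ranks rather than on an assumption of normally distributed scores.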
The overall results correspond to the objective results. While
most of the subjective ratings are not as good as those received
² https://www.crowdflower.com/
³ https://www.mturk.com

