Signal Processing - November 2017 - 75

[39] G. Lin, A. Milan, C. Shen, and I. Reid, "RefineNet: Multi-path refinement
networks for high-resolution semantic segmentation," in Proc. Conf. Computer
Vision and Pattern Recognition (CVPR), July 2017.
[40] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P.
Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in
Proc. European Conf. Computer Vision, 2014, pp. 740-755.

[66] S. Sukhbaatar, A. Szlam, J. Weston, and R. Fergus, "Weakly supervised memory
networks," arXiv Preprint, arXiv:1503.08895, 2015.
[67] M. Tapaswi, Y. Zhu, R. Stiefelhagen, A. Torralba, R. Urtasun, and S. Fidler,
"MovieQA: Understanding stories in movies through question-answering," in Proc.
IEEE Conf. Computer Vision and Pattern Recognition, 2016, pp. 4631-4640.

[41] D. G. Lowe, "Object recognition from local scale-invariant features," in
Proc. IEEE Int. Conf. Computer Vision, 1999, vol. 2, pp. 1150-1157.

[68] D. Teney, P. Anderson, X. He, and A. van den Hengel, "Tips and tricks for visual
question answering: Learnings from the 2017 challenge," arXiv Preprint,
arXiv:1708.02711, 2017.

[42] J. Lu, X. Lin, D. Batra, and D. Parikh. (2015). Deeper LSTM and normalized
CNN visual question answering model [Online]. Available:
https://github.com/VT-vision-lab/VQA_LSTM_CNN

[69] D. Teney, L. Liu, and A. van den Hengel, "Graph-structured representations for
visual question answering," in Proc. IEEE Conf. Computer Vision and Pattern
Recognition, 2017.

[43] J. Lu, J. Yang, D. Batra, and D. Parikh, "Hierarchical question-image co-attention for visual question answering," in Proc. Advances Neural Information
Processing Systems, 2016, pp. 289-297.

[70] D. Teney and A. van den Hengel, "Zero-shot visual question answering," arXiv
Preprint, arXiv:1611.05546, 2016.

[44] L. Ma, Z. Lu, and H. Li, "Learning to answer questions from image using
convolutional neural network," in Proc. 30th AAAI Conference on Artificial
Intelligence, 2016, pp. 3567-3573.
[45] M. Malinowski and M. Fritz, "A multi-world approach to question answering about real-world scenes based on uncertain input," in Proc. Advances Neural
Information Processing Systems, 2014, pp. 1682-1690.
[46] M. Malinowski, M. Rohrbach, and M. Fritz, "Ask your neurons: A neural-based approach to answering questions about images," in Proc. IEEE Int. Conf.
Computer Vision, 2015, pp. 1-9.
[47] C. Matuszek, N. FitzGerald, L. Zettlemoyer, L. Bo, and D. Fox, "A joint
model of language and perception for grounded attribute learning," in Proc. Int.
Conf. Machine Learning, 2012, pp. 1671-1678.

[71] I. Vendrov, R. Kiros, S. Fidler, and R. Urtasun, "Order-embeddings of images and
language," in Proc. Int. Conf. Learning Representations, 2016.
[72] O. Vinyals, C. Blundell, T. Lillicrap, K. Kavukcuoglu, and D. Wierstra,
"Matching networks for one shot learning," in Proc. Neural Information Processing
Systems (NIPS), 2016, pp. 3630-3638.
[73] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, "Show and tell: A neural image
caption generator," in Proc. IEEE Conf. Computer Vision and Pattern Recognition,
2015, pp. 3156-3164.
[74] P. Wang, Q. Wu, C. Shen, and A. van den Hengel, "The VQA-machine: Learning how
to use existing vision algorithms to answer new questions," arXiv Preprint,
arXiv:1612.05386, 2016.
[75] P. Wang, Q. Wu, C. Shen, A. van den Hengel, and A. Dick, "Explicit knowledge-based
reasoning for visual question answering," arXiv Preprint, arXiv:1511.02570, 2015.

[48] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of
word representations in vector space," arXiv Preprint, arXiv:1301.3781, 2013.

[76] P. Wang, Q. Wu, C. Shen, A. van den Hengel, and A. Dick, "FVQA: Fact-based visual
question answering," arXiv Preprint, arXiv:1606.05433, 2016.

[49] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed
representations of words and phrases and their compositionality," in Proc.
Advances in Neural Information Processing Systems, 2013, pp. 3111-3119.

[77] J. Weston, S. Chopra, and A. Bordes, "Memory networks," arXiv Preprint,
arXiv:1410.3916, 2015.

[50] K. W. Murray and J. Krishnamurthy, "Probabilistic neural programs," arXiv
Preprint, arXiv:1612.00712, 2016.
[51] H. Noh, P. H. Seo, and B. Han, "Image question answering using convolutional neural network with dynamic parameter prediction," in Proc. IEEE Conf.
Computer Vision and Pattern Recognition, 2016, pp. 30-38.

[78] T. Winograd, "Understanding natural language," Cognit. Psychol., vol. 3, no. 1,
pp. 1-191, 1972.
[79] Q. Wu, C. Shen, A. van den Hengel, L. Liu, and A. Dick, "What value do explicit
high level concepts have in vision to language problems?" in Proc. IEEE Conf.
Computer Vision and Pattern Recognition, 2016, pp. 203-212.

[52] B. Peng, Z. Lu, H. Li, and K. Wong, "Toward neural network-based reasoning," arXiv Preprint, arXiv:1508.05508, 2015.

[80] Q. Wu, C. Shen, A. van den Hengel, P. Wang, and A. Dick, "Image captioning and
visual question answering based on attributes and their related external knowledge,"
arXiv Preprint, arXiv:1603.02814, 2016.

[53] J. Pennington, R. Socher, and C. Manning, "GloVe: Global vectors for word
representation," in Proc. Conf. Empirical Methods Natural Language
Processing, 2014, pp. 1532-1543.

[81] Q. Wu, D. Teney, P. Wang, C. Shen, A. Dick, and A. van den Hengel, "Visual
question answering: a survey of methods and data sets," Computer Vision and Image
Understanding, to be published.

[54] S. K. Ramakrishnan, A. Pal, G. Sharma, and A. Mittal, "An empirical evaluation of visual question answering for novel objects," arXiv Preprint,
arXiv:1704.02516, 2017.

[82] Q. Wu, P. Wang, C. Shen, A. Dick, and A. van den Hengel, "Ask me anything: Free-form visual question answering based on knowledge from external sources," in Proc.
IEEE Conf. Computer Vision and Pattern Recognition, 2016, pp. 4622-4630.

[55] A. S. Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features
off-the-shelf: An astounding baseline for recognition," in Proc. IEEE Conf.
Computer Vision and Pattern Recognition Workshops, 2014, pp. 806-813.

[83] C. Xiong, S. Merity, and R. Socher, "Dynamic memory networks for visual and
textual question answering," in Proc. Int. Conf. Machine Learning, 2016,
pp. 2397-2406.

[56] S. E. Reed and N. de Freitas, "Neural programmer-interpreters," in Proc.
Int. Conf. Learning Representations, 2016.

[84] C. Xiong, V. Zhong, and R. Socher, "Dynamic coattention networks for question
answering," arXiv Preprint, arXiv:1611.01604, 2016.

[57] M. Ren, R. Kiros, and R. Zemel, "Image question answering: a visual
semantic embedding model and a new data set," in Proc. Advances Neural
Information Processing Systems, 2015.

[85] H. Xu and K. Saenko, "Ask, attend and answer: exploring question-guided spatial
attention for visual question answering," arXiv Preprint, arXiv:1511.05234, 2015.

[58] R. A. Rensink, "The dynamic representation of scenes," Visual Cognition,
vol. 7, no. 1-3, pp. 17-42, 2000.

[86] K. Xu, J. Ba, R. Kiros, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio,
"Show, attend and tell: neural image caption generation with visual attention," in Proc.
Int. Conf. Machine Learning, 2015, pp. 2048-2057.

[59] D. Roy, K.-Y. Hsiao, and N. Mavridis, "Conversational robots: building
blocks for grounding word meaning," in Proc. HLT-NAACL Workshop on
Learning Word Meaning Non-Linguistic Data, 2003, pp. 70-77.

[87] Z. Yang, X. He, J. Gao, L. Deng, and A. Smola, "Stacked attention networks for
image question answering," in Proc. IEEE Conf. Computer Vision and Pattern
Recognition, 2016, pp. 21-29.

[60] K. Saito, A. Shin, Y. Ushiku, and T. Harada, "DualNet: Domain-invariant
network for visual question answering," arXiv Preprint, arXiv:1606.06108,
2016.

[88] X. Yao and B. Van Durme, "Information extraction over structured data: Question
answering with Freebase," in Proc. Conf. Association Computational Linguistics,
2014, pp. 956-966.

[61] A. Santoro, S. Bartunov, M. Botvinick, D. Wierstra, and T. P. Lillicrap,
"Meta-learning with memory-augmented neural networks," in Proc. Int. Conf.
Machine Learning, 2016, vol. 48, pp. 1842-1850.

[89] K.-H. Zeng, T.-H. Chen, C.-Y. Chuang, Y.-H. Liao, J. C. Niebles, and M. Sun,
"Leveraging video descriptions to learn video question answering," in Proc. AAAI
Conf. Artificial Intelligence, 2017, pp. 4334-4340.

[62] A. Santoro, D. Raposo, D. G. Barrett, M. Malinowski, R. Pascanu, P.
Battaglia, and T. Lillicrap, "A simple neural network module for relational reasoning," arXiv Preprint, arXiv:1706.01427, 2017.

[90] P. Zhang, Y. Goyal, D. Summers-Stay, D. Batra, and D. Parikh, "Yin and yang:
Balancing and answering binary visual questions," in Proc. IEEE Conf. Computer
Vision and Pattern Recognition, 2016, pp. 5014-5022.

[63] M. J. Seo, A. Kembhavi, A. Farhadi, and H. Hajishirzi, "Bidirectional attention flow for machine comprehension," arXiv Preprint, arXiv:1611.01603, 2016.

[91] L. Zhu, Z. Xu, Y. Yang, and A. G. Hauptmann, "Uncovering temporal context for video question and answering," arXiv Preprint, arXiv:1511.04670, 2015.

[64] P. Sermanet, A. Frome, and E. Real, "Attention for fine-grained categorization," arXiv Preprint, arXiv:1412.7054, 2014.

[92] Y. Zhu, O. Groth, M. Bernstein, and L. Fei-Fei, "Visual7W: Grounded
question answering in images," in Proc. IEEE Conf. Computer Vision and
Pattern Recognition, 2016, pp. 4995-5004.

[65] K. J. Shih, S. Singh, and D. Hoiem, "Where to look: Focus regions for visual question answering," in Proc. IEEE Conf. Computer Vision and Pattern
Recognition, 2016, pp. 4613-4621.

IEEE SIGNAL PROCESSING MAGAZINE | November 2017 | 75

