Signal Processing - November 2017 - 116

[23] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in Proc. Comput. Sci. Conf., 2014.
[24] S. Venugopalan, H. Xu, J. Donahue, M. Rohrbach, R. Mooney, and K.
Saenko, "Translating videos to natural language using deep recurrent neural networks," in Proc. Conf. North American Chapter Association Computational
Linguistics: Human Language Technologies, 2015, pp. 1494-1505.
[25] S. Venugopalan, M. Rohrbach, J. Donahue, R. Mooney, T. Darrell, and K.
Saenko, "Sequence to sequence-video to text," in Proc. Int. Conf. Computer Vision,
2015, pp. 4534-4542.
[26] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, "Show and tell: A neural
image caption generator," in Proc. Conf. Computer Vision and Pattern
Recognition, 2015, pp. 3156-3164.
[27] J. Johnson, A. Karpathy, and L. Fei-Fei, "DenseCap: Fully convolutional localization networks for dense captioning," in Proc. IEEE Conf. Computer Vision and
Pattern Recognition (CVPR), 2016, pp. 4565-4574.
[28] Q. Wu, C. Shen, L. Liu, A. Dick, and A. van den Hengel, "What value do explicit
high level concepts have in vision to language problems?" in Proc. Conf. Computer
Vision and Pattern Recognition, 2016, pp. 203-212.
[29] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and
Y. Bengio, "Show, attend and tell: Neural image caption generation with visual
attention," in Proc. Int. Conf. Machine Learning, 2015.
[30] Z. Yang, Y. Yuan, Y. Wu, R. Salakhutdinov, and W. W. Cohen, "Review networks for caption generation," in Proc. Conf. Neural Information Processing
Systems, 2016.
[31] Q. You, H. Jin, Z. Wang, C. Fang, and J. Luo, "Image captioning with semantic
attention," in Proc. Conf. Computer Vision and Pattern Recognition, 2016, pp.
4651-4659.
[32] H. Yu, J. Wang, Z. Huang, Y. Yang, and W. Xu, "Video paragraph captioning
using hierarchical recurrent neural networks," in Proc. Conf. Computer Vision and
Pattern Recognition, 2016, pp. 4584-4593.
[33] K. Tran, X. He, L. Zhang, J. Sun, C. Carapcea, C. Thrasher, C. Buehler,
and C. Sienkiewicz, "Rich image captioning in the wild," in Proc. Deep Vision Workshop,
Conf. Computer Vision and Pattern Recognition, 2016, pp. 434-441.
[34] S. Wu, J. Wieland, O. Farivar, and J. Schiller, "Automatic alt-text: Computer-generated image descriptions for blind users on a social network service," in Proc.
20th ACM Conf. Computer Supported Cooperative Work and Social Computing,
2017.

[49] P. Young, A. Lai, M. Hodosh, and J. Hockenmaier, "From image descriptions to
visual denotations: New similarity metrics for semantic inference over event descriptions," in Proc. Association Computational Linguistics, vol. 2, 2014, pp. 67-78.
[50] D. Elliott and F. Keller, "Comparing automatic evaluation measures for image
description," in Proc. 52nd Annu. Meeting Association Computational Linguistics,
2014, pp. 452-457.
[51] T.-Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P. Perona, D.
Ramanan, C. Lawrence Zitnick, and P. Dollár, "Microsoft COCO: Common objects in
context," in Proc. European Conf. Computer Vision, 2014.
[52] Y. Cui, M. R. Ronchi, T.-Y. Lin, P. Dollár, and L. Zitnick. (2015). COCO captioning
challenge. [Online]. Available: http://mscoco.org/dataset/#captions-challenge
[53] Microsoft Cognitive Services Computer Vision API. [Online]. Available:
https://www.microsoft.com/cognitive-services/en-us/computer-vision-api
[54] Z. Yang, X. He, J. Gao, L. Deng, and A. Smola, "Stacked attention networks for
image question answering," in Proc. Conf. Computer Vision and Pattern Recognition,
2016, pp. 21-29.
[55] A. Agrawal, J. Lu, S. Antol, M. Mitchell, L. Zitnick, D. Batra, and D. Parikh,
"VQA: Visual question answering," in Proc. Int. Conf. Computer Vision, 2015, pp.
2425-2433.
[56] A. Das, S. Kottur, K. Gupta, A. Singh, D. Yadav, J. M. F. Moura, D. Parikh, and
D. Batra, "Visual dialog," in Proc. Conf. Computer Vision and Pattern Recognition,
2017.
[57] H. Zhang, T. Xu, H. Li, S. Zhang, X. Huang, X. Wang, and D. Metaxas,
"StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial
networks," in Proc. Int. Conf. Computer Vision, 2017.
[58] T.-H. (K.) Huang, F. Ferraro, N. Mostafazadeh, I. Misra, A. Agrawal, J. Devlin,
R. Girshick, X. He, P. Kohli, D. Batra, C. Lawrence Zitnick, D. Parikh, L.
Vanderwende, M. Galley, and M. Mitchell, "Visual storytelling," in Proc. 2016 Conf.
North American Chapter Association Computational Linguistics: Human Language
Technologies, 2016, pp. 1233-1239.
[59] G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Audio, Speech,
Lang. Process., vol. 20, pp. 30-42, Jan. 2012.

[35] C. Shallue. (2016). Open-source code on show and tell: A neural image caption
generator. [Online]. Available: https://github.com/tensorflow/models/tree/master/im2txt

[60] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, and N. Jaitly, "Deep
neural networks for acoustic modeling in speech recognition," IEEE Signal Process.
Mag., vol. 29, pp. 82-97, Dec. 2012.

[36] L. Deng and D. Yu, Deep Learning: Methods and Applications, NOW
Publishers, 2014.

[61] K. Koenigsbauer, Microsoft Office Blogs. (2016). [Online]. Available:
https://blogs.office.com/2016/12/20/new-to-office-365-in-december-accessibility-updates-and-more/

[37] I. Sutskever, O. Vinyals, and Q. V. Le, "Sequence to sequence learning with
neural networks," in Proc. Conf. Neural Information Processing Systems, 2014.
[38] D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly
learning to align and translate," in Proc. Int. Conf. Learning Representations,
2015.
[39] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Gated feedback recurrent neural networks," in Proc. Int. Conf. Machine Learning, 2015.
[40] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural
Comput., vol. 9, no. 8, pp. 1735-1780, 1997.
[41] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A
large-scale hierarchical image database," in Proc. Conf. Computer Vision and
Pattern Recognition, 2009, pp. 248-255.
[42] H. O. Song, R. Girshick, S. Jegelka, J. Mairal, Z. Harchaoui, and T. Darrell,
"On learning to localize objects with minimal supervision," in Proc. Int. Conf.
Machine Learning, 2014.
[43] C. Zhang, J. C. Platt, and P. A. Viola, "Multiple instance boosting for object
detection," in Proc. Conf. Neural Information Processing Systems, 2005.
[44] S. Banerjee and A. Lavie, "METEOR: An automatic metric for MT evaluation
with improved correlation with human judgments," in Proc. ACL Workshop on
Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or
Summarization, 2005.
[45] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, "BLEU: A method for automatic evaluation of machine translation," in Proc. 40th Annu. Meeting Association
Computational Linguistics, 2002, pp. 311-318.
[46] R. Vedantam, L. Zitnick, and D. Parikh, "CIDEr: Consensus-based image
description evaluation," in Proc. Conf. Computer Vision and Pattern Recognition, 2015, pp.
4566-4575.
[47] P. Anderson, B. Fernando, M. Johnson, and S. Gould, "SPICE: Semantic propositional image caption evaluation," in Proc. European Conf. Computer Vision,
2016.

[48] C. Rashtchian, P. Young, M. Hodosh, and J. Hockenmaier, "Collecting
image annotations using Amazon's Mechanical Turk," in Proc. NAACL HLT
Workshop Creating Speech and Language Data with Amazon's Mechanical
Turk, 2010.

[62] R. R. Varior, B. Shuai, J. Lu, D. Xu, and G. Wang, "A Siamese long short-term
memory architecture for human re-identification," in Proc. European Conf. Computer
Vision, 2016.
[63] J. Liu, A. Shahroudy, D. Xu, and G. Wang, "Spatio-temporal LSTM with trust
gates for 3D human action recognition," in Proc. European Conf. Computer Vision,
2016.
[64] P. Anderson, X. He, C. Buehler, D. Teney, M. Johnson, S. Gould, and L. Zhang,
"Bottom-up and top-down attention for image captioning and VQA," arXiv Preprint,
arXiv:1707.07998.
[65] Z. Ren, X. Wang, N. Zhang, X. Lv, and L.-J. Li, "Deep reinforcement learning-based image captioning with embedding reward," in Proc. Conf. Computer Vision and
Pattern Recognition, 2017.
[66] K. Lin, D. Li, X. He, Z. Zhang, and M.-T. Sun, "Adversarial ranking for language
generation," arXiv Preprint, arXiv:1705.11001.
[67] S. J. Rennie, E. Marcheret, Y. Mroueh, J. Ross, and V. Goel, "Self-critical
sequence training for image captioning," in Proc. Conf. Computer Vision and
Pattern Recognition, 2017.
[68] L. Yu, W. Zhang, J. Wang, and Y. Yu, "SeqGAN: Sequence generative adversarial
nets with policy gradient," in Proc. Association Advancement Artificial Intelligence,
2017.
[69] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA:
MIT Press, 2016.
[70] Q. Wu, D. Teney, P. Wang, C. Shen, A. Dick, and A. van den Hengel, "Visual
question answering: A survey of methods and data sets," in Computer Vision and
Image Understanding. Elsevier, 2017.
[71] Seeing AI. [Online]. Available: https://www.microsoft.com/en-us/seeing-ai/
[72] S. Reed, Z. Akata, X. Yan, L. Logeswaran, H. Lee, and B. Schiele, "Generative
adversarial text to image synthesis," in Proc. Int. Conf. Machine Learning, 2016.




IEEE Signal Processing Magazine | November 2017

