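Authors: 楚航, 罗若天

ArXiv Weekly Radiostation: 30 Selected NLP, CV, and ML Papers This Week (4.12-4.18)

机器之心, together with ArXiv Weekly Radiostation (founded by 楚航 and 罗若天), selects 10 important papers each week in each of NLP, CV, and ML. This week's picks follow: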

ArXiv Weekly: 10 NLP Papers You May Want to Read

[NLP paper 1/10]

Why you may want to read this: Newest paper from Philip S. Yu (Professor of Computer Science, University of Illinois at Chicago).

SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction.

Xuming Hu, Lijie Wen, Yusong Xu, Chenwei Zhang, Philip S. Yu

Open relation extraction is the task of extracting open-domain relation facts from natural language sentences. Existing works either utilize heuristics or distant-supervised annotations to train a supervised classifier over pre-defined relations, or adopt unsupervised methods with additional assumptions that have less discriminative power. In this work, we propose a self-supervised framework named SelfORE, which exploits weak, self-supervised signals by leveraging a large pretrained language model for adaptive clustering on contextualized relational features, and bootstraps the self-supervised signals by improving contextualized features in relation classification. Experimental results on three datasets show the effectiveness and robustness of SelfORE on open-domain Relation Extraction when compared with competitive baselines. Source code is available at https://github.com/THU-BPM/SelfORE.


[NLP paper 2/10]

Why you may want to read this: Newest paper from Tomas Mikolov (Senior Researcher, CIIRC CTU).

Class-Agnostic Continual Learning of Alternating Languages and Domains.

Germán Kruszewski, Ionut-Teodor Sorodoc, Tomas Mikolov

Continual Learning has often been framed as the problem of training a model on a sequence of tasks. In this regard, Neural Networks have been attested to forget the solutions to previous tasks as they learn new ones. Yet, modelling human life-long learning does not necessarily require any crisp notion of tasks. In this work, we propose a benchmark based on language modelling in a multilingual and multidomain setting that prescinds from any explicit delimitation of training examples into distinct tasks, and propose metrics to study continual learning and catastrophic forgetting in this setting. Then, we introduce a simple Product of Experts learning system that performs strongly on this problem while displaying interesting properties, and investigate its merits for avoiding forgetting.
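The abstract does not describe how the experts are combined, so the following is only a minimal sketch of the generic Product of Experts idea: several next-token distributions are multiplied elementwise and renormalized. The function name and the toy distributions are hypothetical and are not taken from the paper.

```python
import math

def product_of_experts(expert_dists):
    """Combine categorical distributions by multiplying them and renormalizing."""
    vocab_size = len(expert_dists[0])
    # Sum log-probabilities instead of multiplying to avoid numerical underflow.
    log_scores = [
        sum(math.log(max(dist[i], 1e-12)) for dist in expert_dists)
        for i in range(vocab_size)
    ]
    # Exponentiate relative to the maximum, then normalize so the result sums to 1.
    peak = max(log_scores)
    unnormalized = [math.exp(score - peak) for score in log_scores]
    total = sum(unnormalized)
    return [value / total for value in unnormalized]

# Hypothetical next-token distributions from two experts over a 3-token vocabulary.
expert_a = [0.7, 0.2, 0.1]
expert_b = [0.5, 0.4, 0.1]
combined = product_of_experts([expert_a, expert_b])
```

A token receives high probability only if every expert assigns it reasonable mass, so the product is sharper than either expert alone.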


[NLP paper 3/10]

Why you may want to read this: Newest paper from Deng Cai (Professor of Computer Science, Zhejiang University), Simon Baker (Distinguished Engineer, nVidia Corporation).

Stylistic Dialogue Generation via Information-Guided Reinforcement Learning Strategy.

Yixuan Su, Deng Cai, Yan Wang, Simon Baker, Anna Korhonen, Nigel Collier, Xiaojiang Liu

Stylistic response generation is crucial for building an engaging dialogue system for industrial use. While it has attracted much research interest, existing methods often generate stylistic responses at the cost of the content quality (relevance and fluency). To enable better balance between the content quality and the style, we introduce a new training strategy, known as Information-Guided Reinforcement Learning (IG-RL). In IG-RL, a training model is encouraged to explore stylistic expressions while being constrained to maintain its content quality. This is achieved by adopting a reinforcement learning strategy with statistical style information guidance for quality-preserving explorations. Experiments on two datasets show that the proposed approach outperforms several strong baselines in terms of the overall response performance.


[NLP paper 4/10]

Why you may want to read this: Newest paper from Simon Baker (Distinguished Engineer, nVidia Corporation), Deng Cai (Professor of Computer Science, Zhejiang University).

Prototype-to-Style: Dialogue Generation with Style-Aware Editing on Retrieval Memory.

Yixuan Su, Yan Wang, Simon Baker, Deng Cai, Xiaojiang Liu, Anna Korhonen, Nigel Collier

The ability of a dialog system to express prespecified language style during conversations has a direct, positive impact on its usability and on user satisfaction. We introduce a new prototype-to-style (PS) framework to tackle the challenge of stylistic dialogue generation. The framework uses an Information Retrieval (IR) system and extracts a response prototype from the retrieved response. A stylistic response generator then takes the prototype and the desired language style as model input to obtain a high-quality and stylistic response. To effectively train the proposed model, we propose a new style-aware learning objective as well as a de-noising learning strategy. Results on three benchmark datasets from two languages demonstrate that the proposed approach significantly outperforms existing baselines in both in-domain and cross-domain evaluation.


[NLP paper 5/10]

Why you may want to read this: Newest paper from Yan Zhang (University of South Carolina).

Satirical News Detection with Semantic Feature Extraction and Game-theoretic Rough Sets.

Yue Zhou, Yan Zhang, JingTao Yao

Satirical news detection is an important yet challenging task to prevent spread of misinformation. Many feature based and end-to-end neural nets based satirical news detection systems have been proposed and delivered promising results. Existing approaches explore comprehensive word features from satirical news articles, but lack semantic metrics using word vectors for tweet form satirical news. Moreover, the vagueness of satire and news parody determines that a news tweet can hardly be classified with a binary decision, that is, satirical or legitimate. To address these issues, we collect satirical and legitimate news tweets, and propose a semantic feature based approach. Features are extracted by exploring inconsistencies in phrases, entities, and between main and relative clauses. We apply game-theoretic rough set model to detect satirical news, in which probabilistic thresholds are derived by game equilibrium and repetition learning mechanism. Experimental results on the collected dataset show the robustness and improvement of the proposed approach compared with Pawlak rough set model and SVM.


[NLP paper 6/10]

Why you may want to read this: Newest paper from Yan Zhang (University of South Carolina).

Improving BERT with Self-Supervised Attention.

Xiaoyu Kou, Yaming Yang, Yujing Wang, Ce Zhang, Yiren Chen, Yunhai Tong, Yan Zhang, Jing Bai

One of the most popular paradigms of applying large, pre-trained NLP models such as BERT is to fine-tune it on a smaller dataset. However, one challenge remains as the fine-tuned model often overfits on smaller datasets. A symptom of this phenomenon is that irrelevant words in the sentences, even when they are obvious to humans, can substantially degrade the performance of these fine-tuned BERT models. In this paper, we propose a novel technique, called Self-Supervised Attention (SSA) to help facilitate this generalization challenge. Specifically, SSA automatically generates weak, token-level attention labels iteratively by "probing" the fine-tuned model from the previous iteration. We investigate two different ways of integrating SSA into BERT and propose a hybrid approach to combine their benefits. Empirically, on a variety of public datasets, we illustrate significant performance improvement using our SSA-enhanced BERT model.


[NLP paper 7/10]

Why you may want to read this: Newest paper from Jimmy Lin (University of Waterloo), Wen Gao (Professor of Computer Science, Peking University), Ming Li (University Professor, University of Waterloo).

Semantics of the Unwritten.

He Bai, Peng Shi, Jimmy Lin, Luchen Tan, Kun Xiong, Wen Gao, Jie Liu, Ming Li

The semantics of a text is manifested not only by what is read, but also by what is not read. In this article, we study how implicit "not read" information, such as end-of-paragraph (EOP) and end-of-sequence (EOS), affects the quality of text generation. Transformer-based pretrained language models (LMs) have demonstrated the ability to generate long continuations with good quality. This model gives us a platform for the first time to demonstrate that paragraph layouts and text endings are also important components of human writing. Specifically, we find that pretrained LMs can generate better continuations by learning to generate the end of the paragraph (EOP) in the fine-tuning stage. Experimental results on English story generation show that EOP can lead to higher BLEU score and lower EOS perplexity. To further investigate the relationship between text ending and EOP, we conduct experiments with a self-collected Chinese essay dataset on Chinese-GPT2, a character-level LM without paragraph breaker or EOS during pre-training. Experimental results show that the Chinese GPT2 can generate better essay endings with paragraph information. Experiments on both English stories and Chinese essays demonstrate that learning to end paragraphs can benefit the continuation generation with pretrained LMs.


[NLP paper 8/10]

Why you may want to read this: Newest paper from Kyunghyun Cho (New York University, Facebook AI Research).

Asking and Answering Questions to Evaluate the Factual Consistency of Summaries.

Alex Wang, Kyunghyun Cho, Mike Lewis

Practical applications of abstractive summarization models are limited by frequent factual inconsistencies with respect to their input. Existing automatic evaluation metrics for summarization are largely insensitive to such errors. We propose an automatic evaluation protocol called QAGS (pronounced "kags") that is designed to identify factual inconsistencies in a generated summary. QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source. To evaluate QAGS, we collect human judgments of factual consistency on model-generated summaries for the CNN/DailyMail (Hermann et al., 2015) and XSUM (Narayan et al., 2018) summarization datasets. QAGS has substantially higher correlations with these judgments than other automatic evaluation metrics. Also, QAGS offers a natural form of interpretability: The answers and questions generated while computing QAGS indicate which tokens of a summary are inconsistent and why. We believe QAGS is a promising tool in automatically generating usable and factually consistent text.
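The comparison step behind this intuition can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how answers are compared, so token-level F1 is assumed here, and the question generation and question answering models are replaced by hard-coded, hypothetical answer pairs. The names `token_f1` and `consistency_score` are illustrative.

```python
from collections import Counter

def token_f1(answer_a, answer_b):
    """Token-level F1 overlap between two answer strings (assumed similarity measure)."""
    tokens_a = answer_a.lower().split()
    tokens_b = answer_b.lower().split()
    if not tokens_a or not tokens_b:
        # Two empty answers agree; one empty answer does not match a non-empty one.
        return float(tokens_a == tokens_b)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(tokens_a) & Counter(tokens_b)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(tokens_a)
    recall = overlap / len(tokens_b)
    return 2 * precision * recall / (precision + recall)

def consistency_score(answer_pairs):
    """Average similarity of (answer from summary, answer from source) pairs."""
    return sum(token_f1(a, b) for a, b in answer_pairs) / len(answer_pairs)

# Hypothetical answers to the same three questions, asked of the summary and of the source.
pairs = [("Paris", "Paris"), ("in 2015", "2018"), ("the red car", "red car")]
score = consistency_score(pairs)
```

A low per-question similarity (the second pair here) points at the part of the summary that disagrees with the source, which is the interpretability property the abstract mentions.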


[NLP paper 9/10]

Why you may want to read this: Newest paper from Alex Waibel (Carnegie Mellon, KIT, Karlsruhe Institute of Technology, University of Karlsruhe).

Error-correction and extraction in request dialogs.

Stefan Constantin, Alex Waibel

We propose a component that gets a request and a correction and outputs a corrected request. To get this corrected request, the entities in the correction phrase replace their corresponding entities in the request. In addition, the proposed component outputs these pairs of corresponding reparandum and repair entity. These entity pairs can be used, for example, for learning in a life-long learning component of a dialog system to reduce the need for correction in future dialogs. For the approach described in this work, we fine-tune BERT for sequence labeling. We created a dataset to evaluate our component, on which we obtained an accuracy of 93.28%. An accuracy of 88.58% has been achieved for out-of-domain data. This accuracy shows that the proposed component is learning the concept of corrections and can be developed to be used as an upstream component to avoid the need for collecting data for request corrections for every new domain.


[NLP paper 10/10]

Why you may want to read this: Newest paper from Yiming Yang (Professor of Computer Science, Carnegie Mellon University).

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices.

Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, Denny Zhou

Natural Language Processing (NLP) has recently achieved great success by using huge pre-trained models with hundreds of millions of parameters. However, these models suffer from heavy model sizes and high latency such that they cannot be deployed to resource-limited mobile devices. In this paper, we propose MobileBERT for compressing and accelerating the popular BERT model. Like the original BERT, MobileBERT is task-agnostic, that is, it can be generically applied to various downstream NLP tasks via simple fine-tuning. Basically, MobileBERT is a thin version of BERT_LARGE, while equipped with bottleneck structures and a carefully designed balance between self-attentions and feed-forward networks. To train MobileBERT, we first train a specially designed teacher model, an inverted-bottleneck incorporated BERT_LARGE model. Then, we conduct knowledge transfer from this teacher to MobileBERT. Empirical studies show that MobileBERT is 4.3x smaller and 5.5x faster than BERT_BASE while achieving competitive results on well-known benchmarks. On the natural language inference tasks of GLUE, MobileBERT achieves a GLUE score of 77.7 (0.6 lower than BERT_BASE), and 62 ms latency on a Pixel 4 phone. On the SQuAD v1.1/v2.0 question answering task, MobileBERT achieves a dev F1 score of 90.0/79.2 (1.5/2.1 higher than BERT_BASE).

ArXiv Weekly: 10 CV Papers You May Want to Read

[CV paper 1/10]

Why you may want to read this: Newest paper from Anil K. Jain (Michigan State University).

Fingerprint Presentation Attack Detection: A Sensor and Material Agnostic Approach.

Steven A. Grosz, Tarang Chugh, Anil K. Jain

The vulnerability of automated fingerprint recognition systems to presentation attacks (PA), i.e., spoof or altered fingers, has been a growing concern, warranting the development of accurate and efficient presentation attack detection (PAD) methods. However, one major limitation of the existing PAD solutions is their poor generalization to new PA materials and fingerprint sensors, not used in training. In this study, we propose a robust PAD solution with improved cross-material and cross-sensor generalization. Specifically, we build on top of any CNN-based architecture trained for fingerprint spoof detection combined with cross-material spoof generalization using a style transfer network wrapper. We also incorporate adversarial representation learning (ARL) in deep neural networks (DNN) to learn sensor and material invariant representations for PAD. Experimental results on LivDet 2015 and 2017 public domain datasets exhibit the effectiveness of the proposed approach.


[CV paper 2/10]

Why you may want to read this: Newest paper from Jitendra Malik (Professor of EECS, UC Berkeley).

It Is Not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction.

Karttikeya Mangalam, Harshayu Girase, Shreyas Agarwal, Kuan-Hui Lee, Ehsan Adeli, Jitendra Malik, Adrien Gaidon

Human trajectory forecasting with multiple socially interacting agents is of critical importance for autonomous navigation in human environments, e.g., for self-driving cars and social robots. In this work, we present Predicted Endpoint Conditioned Network (PECNet) for flexible human trajectory prediction. PECNet infers distant trajectory endpoints to assist in long-range multi-modal trajectory prediction. A novel non-local social pooling layer enables PECNet to infer diverse yet socially compliant trajectories. Additionally, we present a simple "truncation-trick" for improving few-shot multi-modal trajectory prediction performance. We show that PECNet improves state-of-the-art performance on the Stanford Drone trajectory prediction benchmark by ~19.5% and on the ETH/UCY benchmark by ~40.8%.


[CV paper 3/10]

Why you may want to read this: Newest paper from Jitendra Malik (Professor of EECS, UC Berkeley).

Multimodal Image Synthesis with Conditional Implicit Maximum Likelihood Estimation.

Ke Li, Shichong Peng, Tianhao Zhang, Jitendra Malik

Many tasks in computer vision and graphics fall within the framework of conditional image synthesis. In recent years, generative adversarial nets (GANs) have delivered impressive advances in quality of synthesized images. However, it remains a challenge to generate both diverse and plausible images for the same input, due to the problem of mode collapse. In this paper, we develop a new generic multimodal conditional image synthesis method based on Implicit Maximum Likelihood Estimation (IMLE) and demonstrate improved multimodal image synthesis performance on two tasks, single image super-resolution and image synthesis from scene layouts. We make our implementation publicly available.


[CV paper 4/10]

Why you may want to read this: Newest paper from Jitendra Malik (Professor of EECS, UC Berkeley), Larry Davis (Professor of Computer Science, University of Maryland).

Inclusive GAN: Improving Data and Minority Coverage in Generative Models.

Ning Yu, Ke Li, Peng Zhou, Jitendra Malik, Larry Davis, Mario Fritz

Generative Adversarial Networks (GANs) have brought about rapid progress towards generating photorealistic images. Yet the equitable allocation of their modeling capacity among subgroups has received less attention, which could lead to potential biases against underrepresented minorities if left uncontrolled. In this work, we first formalize the problem of minority inclusion as one of data coverage, and then propose to improve data coverage by harmonizing adversarial training with reconstructive generation. The experiments show that our method outperforms the existing state-of-the-art methods in terms of data coverage on both seen and unseen data. We develop an extension that allows explicit control over the minority subgroups that the model should ensure to include, and validate its effectiveness at little compromise from the overall performance on the entire dataset. Code, models, and supplemental videos are available at GitHub.


[CV paper 5/10]

Why you may want to read this: Newest paper from Xiangyu Zhang (Research Leader, Megvii Technology), Jian Sun (Chief Scientist | Managing Director of Research, Megvii (Face++)), Jiaya Jia (Distinguished Scientist, Tencent; Professor, CUHK).

Attentive Normalization for Conditional Image Generation.

Yi Wang, Ying-Cong Chen, Xiangyu Zhang, Jian Sun, Jiaya Jia

Traditional convolution-based generative adversarial networks synthesize images based on hierarchical local operations, where long-range dependency relation is implicitly modeled with a Markov chain. It is still not sufficient for categories with complicated structures. In this paper, we characterize long-range dependence with attentive normalization (AN), which is an extension to traditional instance normalization. Specifically, the input feature map is softly divided into several regions based on its internal semantic similarity, which are respectively normalized. It enhances consistency between distant regions with semantic correspondence. Compared with self-attention GAN, our attentive normalization does not need to measure the correlation of all locations, and thus can be directly applied to large-size feature maps without much computational burden. Extensive experiments on class-conditional image generation and semantic inpainting verify the efficacy of our proposed module.


[CV paper 6/10]

Why you may want to read this: Newest paper from Guillermo Sapiro (Duke University).

Differential 3D Facial Recognition: Adding 3D to Your State-of-the-Art 2D Method.

J. Matias Di Martino, Fernando Suzacq, Mauricio Delbracio, Qiang Qiu, Guillermo Sapiro

Active illumination is a prominent complement to enhance 2D face recognition and make it more robust, e.g., to spoofing attacks and low-light conditions. In the present work we show that it is possible to adopt active illumination to enhance state-of-the-art 2D face recognition approaches with 3D features, while bypassing the complicated task of 3D reconstruction. The key idea is to project over the test face a high spatial frequency pattern, which allows us to simultaneously recover real 3D information plus a standard 2D facial image. Therefore, state-of-the-art 2D face recognition solution can be transparently applied, while from the high frequency component of the input image, complementary 3D facial features are extracted. Experimental results on ND-2006 dataset show that the proposed ideas can significantly boost face recognition performance and dramatically improve the robustness to spoofing attacks.


[CV paper 7/10]

Why you may want to read this: Newest paper from Leonidas Guibas (Professor of Computer Science, Stanford University).

Deformation-Aware 3D Model Embedding and Retrieval.

Mikaela Angelina Uy, Jingwei Huang, Minhyuk Sung, Tolga Birdal, Leonidas Guibas

We introduce a new problem of retrieving 3D models that are not just similar but are deformable to a given query shape. We then present a novel deep deformation-aware embedding to solve this retrieval task. 3D model retrieval is a fundamental operation for recovering a clean and complete 3D model from a noisy and partial 3D scan. However, given a finite collection of 3D shapes, even the closest model to a query may not be a satisfactory reconstruction. This motivates us to apply 3D model deformation techniques to adapt the retrieved model so as to better fit the query. Yet, certain restrictions are enforced in most 3D deformation techniques to preserve important features of the original model that prevent a perfect fitting of the deformed model to the query. This gap between the deformed model and the query induces asymmetric relationships among the models, which cannot be handled by typical metric learning techniques. Thus, to retrieve the best models for fitting, we propose a novel deep embedding approach that learns the asymmetric relationships by leveraging location-dependent egocentric distance fields. We also propose two strategies for training the embedding network. We demonstrate that both of these approaches outperform other baselines in both synthetic evaluations and real 3D object reconstruction.


[CV paper 8/10]

Why you may want to read this: Newest paper from Serge Belongie (Professor of Computer Science, Cornell University and Cornell Tech), Kilian Q. Weinberger (Associate Professor of Computer Science, Cornell University, ASAPP Research).

End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection.

Rui Qian, Divyansh Garg, Yan Wang, Yurong You, Serge Belongie, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger, Wei-Lun Chao

Reliable and accurate 3D object detection is a necessity for safe autonomous driving. Although LiDAR sensors can provide accurate 3D point cloud estimates of the environment, they are also prohibitively expensive for many settings. Recently, the introduction of pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras. PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs. However, so far these two networks have to be trained separately. In this paper, we introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. The resulting framework is compatible with most state-of-the-art networks for both tasks and in combination with PointRCNN improves over PL consistently across all benchmarks -- yielding the highest entry on the KITTI image-based 3D object detection leaderboard at the time of submission. Our code will be made available at https://github.com/mileyan/pseudo-LiDAR_e2e.
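The "2D depth map to 3D point cloud" conversion mentioned above can be sketched with the standard pinhole camera back-projection. This is a hedged illustration of the geometry only: it does not implement the paper's differentiable Change of Representation modules, and the function name, the toy depth map, and the intrinsics (`fx`, `fy`, `cx`, `cy`) are assumed values chosen for the example.

```python
def depth_map_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (rows of depths in meters) into 3D camera coordinates."""
    points = []
    for v, row in enumerate(depth):          # v: pixel row index
        for u, z in enumerate(row):          # u: pixel column index
            if z <= 0:
                # Skip pixels without a valid depth estimate.
                continue
            x = (u - cx) * z / fx            # horizontal offset scaled by depth
            y = (v - cy) * z / fy            # vertical offset scaled by depth
            points.append((x, y, z))
    return points

# Hypothetical 2x2 depth map and camera intrinsics.
depth = [[2.0, 0.0],
         [4.0, 4.0]]
points = depth_map_to_points(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

The resulting list of `(x, y, z)` tuples plays the role of the point cloud that a LiDAR-style 3D detector would take as input.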


[CV paper 9/10]

Why you may want to read this: Newest paper from Michael Jones (Researcher, MERL).

LUVLi Face Alignment: Estimating Landmarks' Location, Uncertainty, and Visibility Likelihood.

Abhinav Kumar, Tim K. Marks, Wenxuan Mou, Ye Wang, Michael Jones, Anoop Cherian, Toshiaki Koike-Akino, Xiaoming Liu, Chen Feng

Modern face alignment methods have become quite accurate at predicting the locations of facial landmarks, but they do not typically estimate the uncertainty of their predicted locations nor predict whether landmarks are visible. In this paper, we present a novel framework for jointly predicting landmark locations, associated uncertainties of these predicted locations, and landmark visibilities. We model these as mixed random variables and estimate them using a deep network trained with our proposed Location, Uncertainty, and Visibility Likelihood (LUVLi) loss. In addition, we release an entirely new labeling of a large face alignment dataset with over 19,000 face images in a full range of head poses. Each face is manually labeled with the ground-truth locations of 68 landmarks, with the additional information of whether each landmark is unoccluded, self-occluded (due to extreme head poses), or externally occluded. Not only does our joint estimation yield accurate estimates of the uncertainty of predicted landmark locations, but it also yields state-of-the-art estimates for the landmark locations themselves on multiple standard face alignment datasets. Our method's estimates of the uncertainty of predicted landmark locations could be used to automatically identify input images on which face alignment fails, which can be critical for downstream tasks.


[CV paper 10/10]

Why you may want to read this: Newest paper from Alan Yuille (Professor of Cognitive Science and Computer Science, Johns Hopkins University).

Context-Aware Group Captioning via Self-Attention and Contrastive Features.

Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille

While image captioning has progressed rapidly, existing works focus mainly on describing single images. In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images. Context-aware group captioning requires not only summarizing information from both the target and reference image group but also contrasting between them. To solve this problem, we propose a framework combining self-attention mechanism with contrastive feature construction to effectively summarize common information from each image group while capturing discriminative information between them. To build the dataset for this task, we propose to group the images and generate the group captions based on single image captions using scene graphs matching. Our datasets are constructed on top of the public Conceptual Captions dataset and our new Stock Captions dataset. Experiments on the two datasets show the effectiveness of our method on this new task. Related datasets and code are released at https://lizw14.github.io/project/groupcap.

ArXiv Weekly: 10 ML Papers You May Want to Read

[ML paper 1/10]

Why you may want to read this: Newest paper from Eibe Frank (Professor, Department of Computer Science, University of Waikato).

Embedding Java Classes with code2vec: Improvements from Variable Obfuscation.

Rhys Compton, Eibe Frank, Panos Patros, Abigail Koay

Automatic source code analysis in key areas of software engineering, such as code security, can benefit from Machine Learning (ML). However, many standard ML approaches require a numeric representation of data and cannot be applied directly to source code. Thus, to enable ML, we need to embed source code into numeric feature vectors while maintaining the semantics of the code as much as possible. code2vec is a recently released embedding approach that uses the proxy task of method name prediction to map Java methods to feature vectors. However, experimentation with code2vec shows that it learns to rely on variable names for prediction, causing it to be easily fooled by typos or adversarial attacks. Moreover, it is only able to embed individual Java methods and cannot embed an entire collection of methods such as those present in a typical Java class, making it difficult to perform predictions at the class level (e.g., for the identification of malicious Java classes). Both shortcomings are addressed in the research presented in this paper. We investigate the effect of obfuscating variable names during the training of a code2vec model to force it to rely on the structure of the code rather than specific names and consider a simple approach to creating class-level embeddings by aggregating sets of method embeddings. Our results, obtained on a challenging new collection of source-code classification problems, indicate that obfuscating variable names produces an embedding model that is both impervious to variable naming and more accurately reflects code semantics. The datasets, models, and code are shared for further ML research on source code.


[ML paper 2/10]

Why you may want to read this: Newest paper from Witold Pedrycz.

Granular Computing: An Augmented Scheme of Degranulation Through a Modified Partition Matrix.

Kaijie Xu, Witold Pedrycz, Zhiwu Li, Mengdao Xing

As an important technology in artificial intelligence, Granular Computing (GrC) has emerged as a new multi-disciplinary paradigm and received much attention in recent years. Information granules forming an abstract and efficient characterization of large volumes of numeric data have been considered as the fundamental constructs of GrC. By generating prototypes and partition matrix, fuzzy clustering is a commonly encountered way of information granulation. Degranulation involves data reconstruction completed on a basis of the granular representatives. Previous studies have shown that there is a relationship between the reconstruction error and the performance of the granulation process. Typically, the lower the degranulation error is, the better the performance of granulation is. However, the existing methods of degranulation usually cannot restore the original numeric data, which is one of the important reasons behind the occurrence of the reconstruction error. To enhance the quality of degranulation, in this study, we develop an augmented scheme through modifying the partition matrix. By proposing the augmented scheme, we dwell on a novel collection of granulation-degranulation mechanisms. In the constructed approach, the prototypes can be expressed as the product of the dataset matrix and the partition matrix. Then, in the degranulation process, the reconstructed numeric data can be decomposed into the product of the partition matrix and the matrix of prototypes. Both the granulation and degranulation are regarded as generalized rotation between the data subspace and the prototype subspace with the partition matrix and the fuzzification factor. By modifying the partition matrix, the new partition matrix is constructed through a series of matrix operations. We offer a thorough analysis of the developed scheme. The experimental results are in agreement with the underlying conceptual framework.
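The reconstruction error the abstract refers to can be seen in a small sketch of the conventional fuzzy c-means granulation-degranulation step. This is the standard baseline scheme, not the augmented scheme proposed in the paper; the prototypes, the data point, and the fuzzification factor `m = 2` are assumed values for illustration.

```python
def memberships(point, prototypes, m=2.0):
    """Fuzzy c-means membership of one point in each prototype (one partition-matrix column)."""
    dists = [
        sum((p - q) ** 2 for p, q in zip(point, proto)) ** 0.5
        for proto in prototypes
    ]
    if 0.0 in dists:
        # A point that coincides with a prototype belongs fully to it.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

def degranulate(u, prototypes, m=2.0):
    """Reconstruct a point as the u^m-weighted average of the prototypes."""
    weights = [u_i ** m for u_i in u]
    total = sum(weights)
    dim = len(prototypes[0])
    return [
        sum(w * proto[d] for w, proto in zip(weights, prototypes)) / total
        for d in range(dim)
    ]

# Hypothetical prototypes (granular representatives) and one data point.
prototypes = [[0.0, 0.0], [4.0, 0.0]]
point = [1.0, 0.0]
u = memberships(point, prototypes)
reconstruction = degranulate(u, prototypes)
```

With these values the memberships are 0.9 and 0.1, and the reconstructed point lies much closer to the first prototype than the original point does, which is the kind of degranulation error that motivates modifying the partition matrix.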


[ML paper 3/10]

Why you may want to read this: Newest paper from Eric Xing (Professor of Machine Learning, Language Technology, Computer Science, Carnegie Mellon …), Tom Mitchell (E. Fredkin University Professor of Machine Learning, Carnegie Mellon University).

Learning from Imperfect Annotations.

Emmanouil Antonios Platanios, Maruan Al-Shedivat, Eric Xing, Tom Mitchell

Many machine learning systems today are trained on large amounts of human-annotated data. Data annotation tasks that require a high level of competency make data acquisition expensive, while the resulting labels are often subjective, inconsistent, and may contain a variety of human biases. To improve the data quality, practitioners often need to collect multiple annotations per example and aggregate them before training models. Such a multi-stage approach results in redundant annotations and may often produce imperfect "ground truth" that may limit the potential of training accurate machine learning models. We propose a new end-to-end framework that enables us to: (i) merge the aggregation step with model training, thus allowing deep learning systems to learn to predict ground truth estimates directly from the available data, and (ii) model difficulties of examples and learn representations of the annotators that allow us to estimate and take into account their competencies. Our approach is general and has many applications, including training more accurate models on crowdsourced data, ensemble learning, as well as classifier accuracy estimation from unlabeled data. We conduct an extensive experimental evaluation of our method on 5 crowdsourcing datasets of varied difficulty and show accuracy gains of up to 25% over the current state-of-the-art approaches for aggregating annotations, as well as significant reductions in the required annotation redundancy.


[ML paper 4/10]

Why you may want to read this: Newest paper from Karen Simonyan (Google DeepMind), Quoc V. Le (Research Scientist, Google Brain).

Evolving Normalization-Activation Layers.

Hanxiao Liu, Andrew Brock, Karen Simonyan, Quoc V. Le

Normalization layers and activation functions are critical components in deep neural networks that frequently co-locate with each other. Instead of designing them separately, we unify them into a single computation graph, and evolve its structure starting from low-level primitives. Our layer search algorithm leads to the discovery of EvoNorms, a set of new normalization-activation layers that go beyond existing design patterns. Several of these layers enjoy the property of being independent from the batch statistics. Our experiments show that EvoNorms not only excel on a variety of image classification models including ResNets, MobileNets and EfficientNets, but also transfer well to Mask R-CNN for instance segmentation and BigGAN for image synthesis, outperforming BatchNorm and GroupNorm based layers by a significant margin in many cases.


[ML paper 5/10]

Why you may want to read this: Newest paper from Huan Liu (Professor of Computer Science and Engineering, Arizona State University).

Leveraging Multi-Source Weak Social Supervision for Early Detection of Fake News.

Kai Shu, Guoqing Zheng, Yichuan Li, Subhabrata Mukherjee, Ahmed Hassan Awadallah, Scott Ruston, Huan Liu

Social media has greatly enabled people to participate in online activities at an unprecedented rate. However, this unrestricted access also exacerbates the spread of misinformation and fake news online, which might cause confusion and chaos unless it is detected early and mitigated. Given the rapidly evolving nature of news events and the limited amount of annotated data, state-of-the-art systems on fake news detection face challenges due to the lack of large numbers of annotated training instances that are hard to come by for early detection. In this work, we exploit multiple weak signals from different sources given by user and content engagements (referred to as weak social supervision), and their complementary utilities to detect fake news. We jointly leverage the limited amount of clean data along with weak signals from social engagements to train deep neural networks in a meta-learning framework to estimate the quality of different weak instances. Experiments on real-world datasets demonstrate that the proposed framework outperforms state-of-the-art baselines for early detection of fake news without using any user engagements at prediction time.


[ML paper 6/10]

Why you may want to read this: Newest paper from Fernando Pereira (VP and Engineering Fellow, Google), William W. Cohen (Google AI).

Guessing What's Plausible But Remembering What's True: Accurate Neural Reasoning for Question-Answering.

Haitian Sun, Andrew O. Arnold, Tania Bedrax-Weiss, Fernando Pereira, William W. Cohen

Neural approaches to natural language processing (NLP) often fail at the logical reasoning needed for deeper language understanding. In particular, neural approaches to reasoning that rely on embedded generalizations of a knowledge base (KB) implicitly model which facts are plausible, but may not model which facts are true, according to the KB. While generalizing the facts in a KB is useful for KB completion, the inability to distinguish between plausible inferences and logically entailed conclusions can be problematic in settings such as KB question answering (KBQA). We propose here a novel KB embedding scheme that supports generalization, but also allows accurate logical reasoning with a KB. Our approach introduces two new mechanisms for KB reasoning: neural retrieval over a set of embedded triples, and "memorization" of highly specific information with a compact sketch structure. Experimentally, this leads to substantial improvements over the state-of-the-art on two KBQA benchmarks.


[ML paper 7/10]

Why you may want to read this: Newest paper from Bernt Schiele (Professor, Max Planck Institute for Informatics, Saarland Informatics Campus, Saarland …), Stefan Roth (Professor of Computer Science, TU Darmstadt).

Normalizing Flows with Multi-Scale Autoregressive Priors.

Shweta Mahajan, Apratim Bhattacharyya, Mario Fritz, Bernt Schiele, Stefan Roth

Flow-based generative models are an important class of exact inference models that admit efficient inference and sampling for image synthesis. Owing to the efficiency constraints on the design of the flow layers, e.g. split coupling flow layers in which approximately half the pixels do not undergo further transformations, they have limited expressiveness for modeling long-range data dependencies compared to autoregressive models that rely on conditional pixel-wise generation. In this work, we improve the representational power of flow-based models by introducing channel-wise dependencies in their latent space through multi-scale autoregressive priors (mAR). Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data. The resulting model achieves state-of-the-art density estimation results on MNIST, CIFAR-10, and ImageNet. Furthermore, we show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.


[ML paper 8/10]

Why you may want to read this: Newest paper from Luca Benini (ETH Zürich, Università di Bologna).

pAElla: Edge-AI based Real-Time Malware Detection in Data Centers.

Antonio Libri, Andrea Bartolini, Luca Benini

The increasing use of Internet-of-Things (IoT) devices for monitoring a wide spectrum of applications, along with the challenges of "big data" streaming support they often require for data analysis, is pushing for increased attention to the emerging edge computing paradigm. In particular, smart approaches to manage and analyze data directly on the network edge are increasingly investigated, and Artificial Intelligence (AI) powered edge computing is envisaged to be a promising direction. In this paper, we focus on Data Centers (DCs) and Supercomputers (SCs), where a new generation of high-resolution monitoring systems is being deployed, opening new opportunities for analysis like anomaly detection and security, but introducing new challenges for handling the vast amount of data it produces. In detail, we report on a novel lightweight and scalable approach to increase the security of DCs/SCs that involves AI-powered edge computing on high-resolution power consumption. The method, called pAElla, targets real-time Malware Detection (MD); it runs on an out-of-band IoT-based monitoring system for DCs/SCs, and involves Power Spectral Density of power measurements, along with AutoEncoders. Results are promising, with an F1-score close to 1, and a False Alarm and Malware Miss rate close to 0%. We compare our method with State-of-the-Art (SoA) MD techniques and show that, in the context of DCs/SCs, pAElla can cover a wider range of malware, significantly outperforming SoA approaches in terms of accuracy. Moreover, we propose a methodology for online training suitable for DCs/SCs in production, and release an open dataset and code.


[ML paper 9/10]

Why you may want to read this: Newest paper from Dacheng Tao (The University of Sydney).

Repulsive Mixture Models of Exponential Family PCA for Clustering.

Maoying Qiao, Tongliang Liu, Jun Yu, Wei Bian, Dacheng Tao

The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about data distribution than the traditional EPCA does. For example, due to the linearity of EPCA's essential form, nonlinear cluster structures cannot be easily handled, but they are explicitly modeled by the mixing extensions. However, the traditional mixture of local EPCAs has the problem of model redundancy, i.e., overlaps among mixing components, which may cause ambiguity for data clustering. To alleviate this problem, in this paper, a repulsiveness-encouraging prior is introduced among mixing components and a diversified EPCA mixture (DEPCAM) model is developed in the Bayesian framework. Specifically, a determinantal point process (DPP) is exploited as a diversity-encouraging prior distribution over the joint local EPCAs. As required, a matrix-valued measure for the L-ensemble kernel is designed, within which ℓ1 constraints are imposed to facilitate selecting effective PCs of local EPCAs, and an angular-based similarity measure is proposed. An efficient variational EM algorithm is derived to perform parameter learning and hidden variable inference. Experimental results on both synthetic and real-world datasets confirm the effectiveness of the proposed method in terms of model parsimony and generalization ability on unseen test data.


[ML paper 10/10]

Why you may want to read this: Newest paper from Pieter Abbeel (UC Berkeley | Covariant.AI).

CURL: Contrastive Unsupervised Representations for Reinforcement Learning.

Aravind Srinivas, Michael Laskin, Pieter Abbeel

We present CURL: Contrastive Unsupervised Representations for Reinforcement Learning. CURL extracts high-level features from raw pixels using contrastive learning and performs off-policy control on top of the extracted features. CURL outperforms prior pixel-based methods, both model-based and model-free, on complex tasks in the DeepMind Control Suite and Atari Games showing 2.8x and 1.6x performance gains respectively at the 100K interaction steps benchmark. On the DeepMind Control Suite, CURL is the first image-based algorithm to nearly match the sample-efficiency and performance of methods that use state-based features.
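To make the "contrastive learning" step concrete, here is a minimal sketch of one common contrastive objective, an InfoNCE-style loss with a bilinear similarity. The abstract does not specify the loss, so the objective, the function name `info_nce_loss`, and the weight matrix `W` are assumptions made for illustration, and the encoders that would produce the embeddings are omitted.

```python
import numpy as np

def info_nce_loss(q, k, W):
    """InfoNCE-style loss (assumed objective, not confirmed by the abstract).

    q: (B, D) query embeddings, e.g. from one augmented view of each observation.
    k: (B, D) key embeddings, e.g. from a second augmented view.
    W: (D, D) bilinear similarity weights (hypothetical learnable parameter).
    Row i of q is treated as matching row i of k; all other rows are negatives.
    """
    logits = q @ W @ k.T                                  # (B, B) similarity scores
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy where the correct "class" for query i is key i.
    return float(-np.mean(np.diag(log_prob)))
```

When the similarity scores carry no information (for example `W` is all zeros), the loss equals `log(B)`; it approaches zero as each query scores its own key far above the others.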

You are welcome to subscribe to the daily edition of the paper podcast: http://www.buzzsprout.com/632479

ArXiv Weekly Radiostation

Weekly selection and podcast of the latest CV, NLP, and ML papers.
