Papers by Armin Shmilovici
International Conference of Computational Methods in Sciences and Engineering 2004 (ICCMSE 2004), 2019
ArXiv, 2019
Image understanding relies heavily on accurate multi-label classification. In recent years, deep learning (DL) algorithms have become very successful tools for multi-label classification of image objects. With this set of tools, various implementations of DL algorithms have been released for public use in the form of application programming interfaces (APIs). In this study, we evaluate and compare 10 of the most prominent publicly available APIs in a best-of-breed challenge. The evaluation is performed on the Visual Genome labeling benchmark dataset using 12 well-recognized similarity metrics. In addition, for the first time in this kind of comparison, we use a semantic similarity metric to evaluate the semantic similarity performance of these APIs. In this evaluation, Microsoft's Computer Vision, TensorFlow, Imagga, and IBM's Visual Recognition showed better performance than the other APIs. Furthermore, the new semantic similarity metric allowed deeper insights for compa...
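To make the semantic-similarity idea concrete, the sketch below scores each predicted label by its closest ground-truth label using WordNet's Wu-Palmer similarity (via NLTK). It is an illustration only; the metric choice, the noun-synset restriction, and the averaging rule are assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a semantic label-matching score (not the paper's exact metric).
# Assumes NLTK with the WordNet corpus installed: pip install nltk; nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def label_similarity(a: str, b: str) -> float:
    """Best Wu-Palmer similarity over the noun synsets of two labels (0 if none)."""
    syns_a = wn.synsets(a, pos=wn.NOUN)
    syns_b = wn.synsets(b, pos=wn.NOUN)
    scores = [s1.wup_similarity(s2) or 0.0 for s1 in syns_a for s2 in syns_b]
    return max(scores, default=0.0)

def semantic_precision(predicted: list[str], ground_truth: list[str]) -> float:
    """Average, over predicted labels, of the similarity to the closest true label."""
    if not predicted:
        return 0.0
    return sum(max((label_similarity(p, t) for t in ground_truth), default=0.0)
               for p in predicted) / len(predicted)

# Example: "automobile" is rewarded for matching "car" even though the strings differ.
print(semantic_precision(["automobile", "tree"], ["car", "person"]))
```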
Expert Systems with Applications, 2020
Image understanding heavily relies on accurate multi-label classification. In recent years, deep learning algorithms have become very successful for such tasks, and various commercial and open-source APIs have been released for public use. However, these APIs are often trained on different datasets, which, besides affecting their performance, might pose a challenge to their performance evaluation. This challenge concerns the different object-class dictionaries of the APIs' training datasets and the benchmark dataset, in which the predicted labels are semantically similar to the benchmark labels but are considered different simply because they have different wording in the dictionaries. To face this challenge, we propose semantic similarity metrics to obtain a richer understanding of the APIs' predicted labels and thus their performance. In this study, we evaluate and compare the performance of 13 of the most prominent commercial and open-source APIs in a best-of-breed challenge on the Visual Genome and Open Images benchmark datasets. Our findings demonstrate that, when using traditional metrics, the Microsoft Computer Vision, Imagga, and IBM APIs performed better than the others. However, applying semantic metrics also unveils the InceptionResNet-v2, Inception-v3, and ResNet50 APIs, which are trained only on the simpler ImageNet dataset, as challengers for top semantic performers.
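For contrast, the traditional set-overlap metrics that the semantic metrics are compared against can be sketched as follows; the four scores below are illustrative stand-ins for the larger metric suite used in the study.

```python
# Sketch of "traditional" set-overlap metrics that treat labels as exact strings.
# Illustrative only; the paper's full metric suite is broader.
def exact_match_scores(predicted: set[str], ground_truth: set[str]) -> dict[str, float]:
    tp = len(predicted & ground_truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    jaccard = tp / len(predicted | ground_truth) if predicted | ground_truth else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "jaccard": jaccard}

# "automobile" vs "car": a correct prediction scores zero under exact matching,
# which is exactly the dictionary-mismatch problem the semantic metrics address.
print(exact_match_scores({"automobile", "tree"}, {"car", "tree", "person"}))
```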
Expert Systems with Applications, 2019
Highlights:
• This paper proposes a methodology for detecting turning points in movies.
• The methodology builds upon drifts between the event clock and the weighted clock.
• Only the movie subtitles are used as input.
• Encouraging results are obtained on 28 episodes of a popular cartoon series.
• The methodology is capable of discovering additional story elements in a movie.
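The highlights do not define the two clocks, so the following sketch is only one loose reading of the idea: treat screen time as the event clock, cumulative dialogue volume as the weighted clock, and flag candidate turning points where their drift reverses direction. Every definition here is an assumption for illustration.

```python
# Hypothetical illustration of a "clock drift" signal built from subtitles.
# Assumptions (not the paper's definitions):
#   event clock    = normalized screen time of each subtitle line,
#   weighted clock = normalized cumulative dialogue word count,
# and candidate turning points are local extrema of their difference.
import numpy as np

def drift_turning_points(times_sec: list[float], texts: list[str]) -> list[int]:
    t = np.asarray(times_sec, dtype=float)
    event_clock = (t - t[0]) / (t[-1] - t[0])            # 0..1 over the movie
    words = np.array([len(s.split()) for s in texts], float)
    weighted_clock = np.cumsum(words) / words.sum()       # 0..1 over the dialogue
    drift = weighted_clock - event_clock
    # local extrema: where the discrete derivative of the drift changes sign
    d = np.diff(drift)
    return [i + 1 for i in range(len(d) - 1) if d[i] * d[i + 1] < 0]

# Usage: feed (start time in seconds, text) pairs parsed from an .srt subtitle file.
```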
Foundations of Computing and Decision Sciences, 2006
The Relevance Vector Machine (RVM) is a method for training sparse generalized linear models, and its accuracy is comparable to that of other machine learning techniques. For a dataset of size N, the runtime complexity of the RVM is O(N³) and its space complexity is O(N²), which makes it too expensive for moderately sized problems. We suggest three different algorithms that partition the dataset into manageable chunks. Our experiments on benchmark datasets indicate that the partition algorithms can significantly reduce the complexity of the RVM while retaining the attractive attributes of the original solution.
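A minimal sketch of the chunk-and-merge idea is shown below, assuming a hypothetical train_rvm() routine that returns the indices of the relevance vectors it keeps; it illustrates one plausible partitioning strategy rather than the three algorithms of the paper.

```python
# Sketch of one chunk-and-merge strategy for scaling the RVM to larger datasets.
# train_rvm() is a hypothetical stand-in for any RVM implementation; it is assumed
# to return the indices of the relevance vectors it keeps.
import numpy as np

def chunked_rvm(X: np.ndarray, y: np.ndarray, train_rvm, chunk_size: int = 500):
    n = len(X)
    kept = []
    for start in range(0, n, chunk_size):
        idx = np.arange(start, min(start + chunk_size, n))
        rv_local = train_rvm(X[idx], y[idx])              # cubic only in chunk_size
        kept.extend(idx[rv_local])                        # map back to global indices
    kept = np.array(kept)
    # Final pass on the union of relevance vectors, typically much smaller than n.
    rv_final = train_rvm(X[kept], y[kept])
    return kept[rv_final]
```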
We present a new approach for gene finding based on a variable-order Markov (VOM) model. The VOM model is a generalization of the traditional Markov model; it is more efficient in terms of its parameterization and, thus, can be trained on relatively short sequences. As a result, the proposed VOM gene-finder outperforms traditional gene-finders that are based on fifth-order Markov models for short, newly sequenced bacterial genomes.
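The variable-order idea can be illustrated with a small sketch that backs off to the longest context seen in training, with add-one smoothing as a simple stand-in for the smoothing and pruning used in real VOM gene-finders.

```python
# Minimal sketch of a variable-order Markov (VOM) scorer for DNA, assuming a simple
# longest-matching-context rule with add-one smoothing. Illustrative only.
from collections import defaultdict
from math import log

ALPHABET = "ACGT"

def train_vom(seq: str, max_order: int = 5):
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(seq)):
        for k in range(0, max_order + 1):
            if i - k < 0:
                break
            counts[seq[i - k:i]][seq[i]] += 1              # context -> next-symbol counts
    return counts

def log_likelihood(seq: str, counts, max_order: int = 5) -> float:
    ll = 0.0
    for i in range(len(seq)):
        # back off to the longest context that was seen in training
        for k in range(min(max_order, i), -1, -1):
            ctx = seq[i - k:i]
            if ctx in counts:
                break
        c = counts[ctx]
        total = sum(c.values()) + len(ALPHABET)            # add-one smoothing
        ll += log((c[seq[i]] + 1) / total)
    return ll

model = train_vom("ATGACCGTAGCTAGCATCGATCGGCTA" * 10)
print(log_likelihood("ATGACCGTAGCT", model))
```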
A data warehouse is a special database used for storing business-oriented information for future analysis and decision-making. In business scenarios where some of the data or the business attributes are fuzzy, it may be useful to construct a warehouse that can support the analysis of fuzzy data. Here, we outline how Kimball's methodology for the design of a data warehouse can be extended to the construction of a fuzzy data warehouse. A case study demonstrates the viability of the methodology.
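A tiny sketch of a fuzzy dimension illustrates the idea: a numeric attribute is mapped to linguistic terms with membership degrees, and a roll-up weights the facts by those degrees. The terms and breakpoints are invented for illustration and are not taken from the case study.

```python
# Sketch of a fuzzy dimension in a data warehouse: a numeric attribute is mapped to
# linguistic terms with membership degrees, and facts are aggregated with those
# degrees as weights. Terms and breakpoints are illustrative only.
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function peaking at b over [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

AGE_TERMS = {
    "young":       lambda age: triangular(age, 0, 20, 40),
    "middle-aged": lambda age: triangular(age, 30, 45, 60),
    "senior":      lambda age: triangular(age, 50, 70, 120),
}

# Fact table rows: (customer_age, sales_amount)
facts = [(25, 100.0), (42, 250.0), (67, 80.0)]

# Fuzzy roll-up: total sales attributed to each linguistic age group.
rollup = {term: sum(mu(age) * amount for age, amount in facts)
          for term, mu in AGE_TERMS.items()}
print(rollup)
```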
Recent Developments and the New Direction in Soft-Computing Foundations and Applications, 2020
With the rapidly increasing number of online video resources, the ability to automatically understand those videos becomes more and more important, since it is almost impossible for people to watch all of them and provide textual descriptions. The duration of online videos varies over an extremely wide range, from several seconds to more than 5 hours. In this paper, we focus on long videos, especially full-length movies, and propose the first pipeline for automatically generating textual summaries of such movies. The proposed system takes an entire movie as input (including subtitles), splits it into scenes, generates a one-sentence description for each scene, and summarizes those descriptions and subtitles into a final summary. In our initial experiment on a popular cinema movie (Forrest Gump), we utilize several existing algorithms and software tools to implement the different components of our system. Most importantly, we use the S2VT (Sequence to Sequence - Video to Text) algorithm for scene description generation and MUSEEC (MUltilingual SEntence Extraction and Compression) for extractive text summarization. We present preliminary results from our prototype experimental framework. An evaluation of the resulting textual summaries for a movie made of 156 scenes demonstrates the feasibility of the approach: the summary contains the descriptions of three of the four most important scenes/storylines in the movie. Although the summaries are far from satisfactory, we argue that the current results can be used to prove the merit of our approach.
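The pipeline can be sketched as a composition of three components. The helper functions below (split_into_scenes, caption_scene, summarize_text) are hypothetical stand-ins for a scene segmenter, an S2VT-style captioner, and an extractive summarizer such as MUSEEC; they are not real library calls.

```python
# Skeleton of the movie-to-summary pipeline described above. All three helpers are
# hypothetical stand-ins supplied by the caller, not real library APIs.
from dataclasses import dataclass

@dataclass
class Scene:
    start_sec: float
    end_sec: float
    subtitle_text: str

def movie_summary(video_path: str, subtitles: str,
                  split_into_scenes, caption_scene, summarize_text,
                  max_sentences: int = 10) -> str:
    # 1) split the movie into scenes (returns a list of Scene objects)
    scenes = split_into_scenes(video_path, subtitles)
    # 2) one sentence per scene from the visual channel
    captions = [caption_scene(video_path, s.start_sec, s.end_sec) for s in scenes]
    # 3) interleave captions with each scene's subtitles, then summarize extractively
    document = "\n".join(f"{cap} {s.subtitle_text}" for cap, s in zip(captions, scenes))
    return summarize_text(document, max_sentences=max_sentences)
```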
2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), 2017
Automatic generation of natural language descriptions for images has recently become an important research topic. In this paper, we propose a frame-based algorithm for generating a composite natural language description for a given image. The goal of this algorithm is to describe not only the objects appearing in the image but also the main activities happening in it and the objects participating in those activities. The algorithm builds upon a pre-trained CRF (Conditional Random Field)-based structured prediction model, which generates a set of alternative frames for a given image. We use imSitu, a situation recognition dataset with 126,102 images, 504 activities, 11,538 objects, and 1,788 roles, as a test bed for our algorithm. We ask human evaluators to evaluate the quality of the descriptions for 20 images from the imSitu dataset. The results demonstrate that our composite descriptions contain on average 16% more visual elements than the baseline method and receive a significantly higher accuracy score from the human evaluators.
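A toy sketch of the realization step follows: it turns a frame (a verb plus role-to-noun fillers) into a sentence and joins several frames into a composite description. The role set and surface template are simplifications, not the paper's generation rules.

```python
# Toy illustration of turning imSitu-style frames (a verb plus role->noun fillers)
# into sentences and combining them into one composite description.
def frame_to_sentence(verb: str, roles: dict[str, str]) -> str:
    agent = roles.get("agent", "someone")
    item = roles.get("item", "")
    place = roles.get("place", "")
    sentence = f"the {agent} is {verb}"
    if item:
        sentence += f" the {item}"
    if place:
        sentence += f" in the {place}"
    return sentence

def composite_description(frames: list[tuple[str, dict[str, str]]]) -> str:
    return ", and ".join(frame_to_sentence(v, r) for v, r in frames).capitalize() + "."

print(composite_description([
    ("riding", {"agent": "man", "item": "bicycle", "place": "park"}),
    ("carrying", {"agent": "man", "item": "backpack"}),
]))
```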
The Relevance Vector Machine (RVM) is a generalized linear model that can use kernel functions as basis functions. The typical RVM solution is very sparse. We present a strategy for feature ranking and selection via evaluating the influence of the features on the relevance vectors. This requires only a single training of the RVM and is therefore very efficient. Experiments on a benchmark regression problem provide evidence that it selects high-quality feature sets at a fraction of the cost of classical methods. Keywords: Feature Selection, Relevance Vector Machine, Machine Learning
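One plausible reading of the influence score is a perturbation sensitivity of the trained RVM's prediction function, sketched below for an RBF kernel. The specific score is an assumption, and rvs/weights stand for the relevance vectors and weights obtained from a single RVM training run.

```python
# Hedged sketch of ranking input features by their influence on a trained RVM's
# prediction y(x) = sum_i w_i * K(x, rv_i) with an RBF kernel. The perturbation
# sensitivity below is one plausible interpretation, not the paper's exact score.
import numpy as np

def rbf_predict(X, rvs, weights, gamma=1.0):
    # pairwise squared distances between rows of X and the relevance vectors
    d2 = ((X[:, None, :] - rvs[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2) @ weights

def feature_influence(X, rvs, weights, gamma=1.0, eps=0.1):
    base = rbf_predict(X, rvs, weights, gamma)
    scores = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] += eps * X[:, j].std()                    # nudge feature j
        scores.append(np.abs(rbf_predict(Xp, rvs, weights, gamma) - base).mean())
    return np.argsort(scores)[::-1]                        # most influential first

# Usage: rvs and weights come from a single RVM training run.
```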
Occupational factors have long been linked to patterns of mortality. Based on the premise that an entry in an encyclopedia tends to imply success in one's vocation, we used Wikipedia biographical entries (English version) to elucidate the relationship between career success and longevity. Analyzing 7756 Wikipedia entries for persons deceased between 2009 and 2011 in terms of gender, occupation, and longevity, we found that male entries outnumbered female entries (6548 vs. 1208) and that the mean age of death was lower for males than for females (76.31 vs. 78.50 years). Younger ages of death were evident among sports players and performing artists (73.04 years) and creative workers (74.68 years). Older ages of death were seen among professionals and academics (82.63 years). Since these results are comparable with those found in the literature, they validate the use of Wikipedia for population studies. The gender classification procedure we developed for the biographical entries in order to obtain an occupation-by-gender com...
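The gender classification procedure is not detailed in this excerpt; a common baseline for such studies, shown below purely as an assumption, classifies an entry by the balance of gendered pronouns in the article text.

```python
# Sketch of one common way to assign gender to a biographical article: counting
# gendered pronouns in the text. This is an assumption for illustration and is not
# necessarily the classification procedure developed in the study.
import re

MALE = {"he", "him", "his"}
FEMALE = {"she", "her", "hers"}

def classify_gender(article_text: str) -> str:
    tokens = re.findall(r"[a-z']+", article_text.lower())
    m = sum(t in MALE for t in tokens)
    f = sum(t in FEMALE for t in tokens)
    if m == f:
        return "unknown"
    return "male" if m > f else "female"

print(classify_gender("She was a physicist; her work on spectroscopy was widely cited."))
```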
21st IEEE Convention of the Electrical and Electronic Engineers in Israel. Proceedings (Cat. No.00EX377), 2000
We propose to extend the use of Rissanen's (1983) "tree source" (a relative of the partial hidden Markov model) to continuous signals. While the original algorithm is dedicated to modeling the context in which each symbol can occur in a discrete symbol space, we propose to match a specific ARMA model with each identified context. An example is presented.
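The idea can be sketched as follows: quantize the signal into symbols, group samples by their recent symbolic context, and fit a separate linear predictor per context. For brevity the sketch fits an AR(p) model by least squares instead of a full ARMA model, and all parameter choices are illustrative.

```python
# Illustrative sketch: quantize a continuous signal into symbols, group samples by
# their recent symbolic context, and fit a separate AR(p) model per context by
# least squares (a simplification of matching a full ARMA model per context).
import numpy as np

def fit_context_ar(signal, n_bins=4, context_len=2, p=3):
    edges = np.quantile(signal, np.linspace(0, 1, n_bins + 1)[1:-1])
    symbols = np.digitize(signal, edges)                   # discrete symbol per sample
    models = {}
    start = max(context_len, p)
    contexts = {tuple(symbols[i - context_len:i]) for i in range(start, len(signal))}
    for ctx in contexts:
        idx = [i for i in range(start, len(signal))
               if tuple(symbols[i - context_len:i]) == ctx]
        if len(idx) <= p:
            continue                                       # too little data for this context
        X = np.array([signal[i - p:i] for i in idx])       # p lagged values per sample
        y = np.array([signal[i] for i in idx])
        models[ctx], *_ = np.linalg.lstsq(X, y, rcond=None)
    return models

models = fit_context_ar(np.sin(np.linspace(0, 60, 1000)) + 0.1 * np.random.randn(1000))
print(len(models), "contexts fitted")
```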