Search

Busa, Roberto

The ONLI (Osservatorio Neologico della Lingua Italiana), established in 1998, aims to study the Italian vocabulary and its evolution between the 20th and 21st centuries, analyzing neologisms by using investigative methods and the rules that describe how new words are formed and applying them to newspaper quotations from between 1998 and 2019 collected in its own data base. The classification of neologisms adopted by ONLI enables the trends in Italian vocabulary to be highlighted also through a comparison of opinions of scholars from related areas.

Parole nuove, Le parole dell'italiano

Exploratory analysis of images engraved on ancient near-eastern seals based on a distance among strings

For the structure analysis of images engraved on ancient Near-Eastern seals, a method was developed able of computing the distance among strings of symbols repre-senting each image structure. In strings, couples of parentheses enclose substrings corresponding to subpattern, that are outlined in this way. The procedure examines operations of deletion, insertion, and substitution, necessary to map a string into another. The method uses a set of weights to be attributed to each operation in order to derive a distance based on these weights. The distance matrix was submitted to principal coordinates analysis, followed by a hierarchical classification of seals according to the first three principal axes coordinates. The results of the pilot analysis performed on a sample of a hundred seals are highly promising for further developments.

Éléments d'un modèle pour la description des lexiques documentaires

La lexicographie naturelle qui vise à observer certains faits concernant les occurrences de mots dans une langue donnée (glossaires, thesaurus), se définit par opposition à la lexicographie documentaire qui aboutit à une liste de termes organisés ou non, destinés à l'indexation automatique ou non, et dont les classifications sont un groupe parmi d'autres. Le fondement des classifications peut être d'organisation sémantique (ordre des termes fondé sur l'essence des entités qu'ils désignent, le cas-limite étant la taxinomie), ou d'organisation syntaxique (ordre des termes fondé sur la fonction des entités dans un champ d'observation déterminé, le cas-limite étant la facette)

Methods for the Descriptive Analysis of Archaeological Material

Most studies on the use of punched cards and computers in archaeology seem to take for granted that scientific standards exist to express the data upon which algorithms are to be performed, for retrieval or classification purposes. The author's view is different; examples are given of descriptive codes which have been designed under his direction since 1955 for the storage of archaeological data (artifacts, abstract or figured representations, buildings, etc.) on punched cards of various kinds (marginal, peek-a-boo, IBM, etc.). In order to obviate the shortcomings of natural language, three categories of rules are required: orientation, segmentation, differentiation. The concluding remarks concern the relation of the descriptive languages which are thus obtained to scientific language in general; differences are stressed, as well as reasons for postulating a continuum from the former to the latter.

Generalized procrustes analysis

Data Description and the Integrated Study of Ancient Near Eastern Works of Art: The Potential of Cylinder Seals

Scene-in-frammenti: una proposta di analisi delle "scene di presentazione" dei sigilli a cilindro mesopotamici orientata all'elaborazione statistica ed informatica dei dati

Spad.T – Version 1.5 – Manuel de référence

Analyse statistique des données textuelles.Questions ouvertes et lexicométrie

Die Einführungsszene: Entwicklung eines mesopotamischen Motivs von der altakkadischen bis zum Ende der altbabylonischen Zeit

The King and the Cup: Iconography of the Royal Presentation Scene on Ur III Seals. Insight through Images: Studies in Honor of Edith Porada.

In the Autumn of 195 9, I was permitted as an und ergraduate to participat e in a graduate seminar on Royal Ico nography taught by Edit h Porada at Columbia University. Th e topic I was given for this, my first class presentati on, was an

Analysis of ancient Near-Eastern cylinder seals (Late Fourth Millennium B.C.)

Procrustes Problems

Procrustean methods are used to transform one set of data to represent another set of data as closely as possible. The name derives from the Greek myth where Procrustes invited passers-by in for a pleasant meal and a night's rest on a magical bed that would exactly fit any guest. He then either stretched the guest on the rack or cut off their legs to make them fit perfectly into the bed. Theseus turned the tables on Procrustes, fatally adjusting him to fit his own bed.This text, the first monograph on Procrustes methods, unifies several strands in the literature and contains much new material. It focuses on matching two or more configurations by using orthogonal, projection and oblique axes transformations. Group-average summaries play an important part and links with other group-average methods are discussed. This is the latest in the well-established and authoritative Oxford Statistical Science Series, which includes texts and monographs covering many topics of current research interest in pure and applied statistics. Each title has an original slant even if the material included is not specifically original. The authors are leading researchers and the topics covered will be of interest to all professional statisticians, whether they be in industry, government department or research institute. Other books in the series include 23. W.J.Krzanowski: Principles of multivariate analysis: a user's perspective updated edition 24. J.Durbin and S.J.Koopman: Time series analysis by State Space Models 25. Peter J. Diggle, Patrick Heagerty, Kung-Yee Liang, Scott L. Zeger: Analysis of Longitudinal Data 2/e 26. J.K. Lindsey: Nonlinear Models in Medical Statistics 27. Peter J. Green, Nils L. Hjort & Sylvia Richardson: Highly Structured Stochastic Systems 28. Margaret S. Pepe: The Statistical Evaluation of Medical Tests for Classification and Prediction 29. Christopher G. Small and Jinfang Wang: Numerical Methods for Nonlinear Estimating Equations

Reconstructing Lexicography in Glyptic Art: Structural Relations between the Akkadian age and the Ur III Period

Art History of the Ancient Near East and Mathematical Models. An Overview

The use of mathematical models in the art history of the pre-classic Near East is still comparatively little popular, partly because of cultural as well as technical and logical problems. In the history of research, such kind of approaches have been

Ancient Mesopotamian Glyptic Products, Statistics and Data Mining: A Research Proposal

A stratified and complex investigation of the figurative language of a corpus of Mesopotamian glyptic artefacts will be described here. The methodologies adopted and the formal description of the products under investigation are the result of a

Legitimation of Authority Through Image and Legend: Seals Belonging to Officials in the Administrative Bureaucracy of the Ur 3. State

Comparative Use of Mathematical Models in an Investigation on Mesopotamian Cylinder Seals

Statistique textuelle

La statistique textuelle, en plein développement, est à la croisée de plusieurs disciplines: la statistique classique, la linguistique, l'analyse du discours, l'informatique, le traitement des enquêtes. En effet, chercheurs et praticiens ont aujourd'hui à faire face à un double développement, d'une part celui des textes provenant des enquêtes, des entretiens, des archives, des bases documentaires, d'autre part, celui des outils informatiques de saisie et de gestion de textes. La statistique textuelle se veut précisément un outil destiné à parfaire l'analyse, la description, la comparaison, en un mot, le traitement des textes. Ce livre, illustré d'exemples nombreux, présente les concepts de base et les fondements des méthodes de la statistique textuelle. Il combine une approche pédagogique des outils et un exposé sur l'état de l'art de cette discipline.

The Guttman effect: Its interpretation and a new redressing method

Understanding How Dimension Reduction Tools Work: An Empirical Approach to Deciphering t-SNE, UMAP, TriMAP, and PaCMAP for Data Visualization

Dimension reduction (DR) techniques such as t-SNE, UMAP, and TriMAP have demonstrated impressive visualization performance on many real world datasets. One tension that has always faced these methods is the trade-off between preservation of global structure and preservation of local structure: these methods can either handle one or the other, but not both. In this work, our main goal is to understand what aspects of DR methods are important for preserving both local and global structure: it is difficult to design a better method without a true understanding of the choices we make in our algorithms and their empirical impact on the lower-dimensional embeddings they produce. Towards the goal of local structure preservation, we provide several useful design principles for DR loss functions based on our new understanding of the mechanisms behind successful DR methods. Towards the goal of global structure preservation, our analysis illuminates that the choice of which components to preserve is important. We leverage these insights to design a new algorithm for DR, called Pairwise Controlled Manifold Approximation Projection (PaCMAP), which preserves both local and global structure. Our work provides several unexpected insights into what design choices both to make and avoid when constructing DR algorithms.

Nonlinear Dimensionality Reduction by Locally Linear Embedding

Many areas of science depend on exploratory data analysis and visualization. The need to analyze large amounts of multivariate data raises the fundamental problem of dimensionality reduction: how to discover compact representations of high-dimensional data. Here, we introduce locally linear embedding (LLE), an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs. Unlike clustering methods for local dimensionality reduction, LLE maps its inputs into a single global coordinate system of lower dimensionality, and its optimizations do not involve local minima. By exploiting the local symmetries of linear reconstructions, LLE is able to learn the global structure of nonlinear manifolds, such as those generated by images of faces or documents of text.

Time as a Hidden Dimension in Archaeological Information Systems: Spatial Analysis Within and Without the Geographic Framework

Time is an indispensable element in most archaeological studies. However, GIS models cannot easily accommodate various issues raising from the specifics of archaeological dating. The formation processes and post-depositional transformations that have affected the present nature of the archaeological record must be assessed prior to designing any GIS project in archaeology. This paper highlights some issues emanating from this matter for a GIS user and introduces an approach that may enrich the current spectrum of spatial techniques in archaeology. The traditional intra-site spatial analysis based on the concept of geographical space is complemented with new experiments, where the spatial investigation is understood more broadly. An attempt is made to map a multidimensional formal space in GIS, which has its coordinate system defined by the proncipal component factor analysis conducted on mortuary data. The exposition demonstrates that GIS can successfully model many archaeological phenomena, be they primarily geographic or not. The key idea here is that GIS tools are able to analyze general problems including those not related to geography, on the condition that they can be translated into models of spatial nature (e.g. some formal topological model).

Considerably Improving Clustering Algorithms Using UMAP Dimensionality Reduction Technique: A Comparative Study

Dimensionality reduction is widely used in machine learning and big data analytics since it helps to analyze and to visualize large, high-dimensional datasets. In particular, it can considerably help to perform tasks like data clustering and classification. Recently, embedding methods have emerged as a promising direction for improving clustering accuracy. They can preserve the local structure and simultaneously reveal the global structure of data, thereby reasonably improving clustering performance. In this paper, we investigate how to improve the performance of several clustering algorithms using one of the most successful embedding techniques: Uniform Manifold Approximation and Projection or UMAP. This technique has recently been proposed as a manifold learning technique for dimensionality reduction. It is based on Riemannian geometry and algebraic topology. Our main hypothesis is that UMAP would permit to find the best clusterable embedding manifold, and therefore, we applied it as a preprocessing step before performing clustering. We compare the results of many well-known clustering algorithms such ask-means, HDBSCAN, GMM and Agglomerative Hierarchical Clustering when they operate on the low-dimension feature space yielded by UMAP. A series of experiments on several image datasets demonstrate that the proposed method allows each of the clustering algorithms studied to improve its performance on each dataset considered. Based on Accuracy measure, the improvement can reach a remarkable rate of 60%.

A New Basis for Sparse Principal Component Analysis

A Guide for Sparse PCA: Model Comparison and Applications

PCA is a popular tool for exploring and summarizing multivariate data, especially those consisting of many variables. PCA, however, is often not simple to interpret, as the components are a linear combination of the variables. To address this issue, numerous methods have been proposed to sparsify the nonzero coefficients in the components, including rotation-thresholding methods and, more recently, PCA methods subject to sparsity inducing penalties or constraints. Here, we offer guidelines on how to choose among the different sparse PCA methods. Current literature misses clear guidance on the properties and performance of the different sparse PCA methods, often relying on the misconception that the equivalence of the formulations for ordinary PCA also holds for sparse PCA. To guide potential users of sparse PCA methods, we first discuss several popular sparse PCA methods in terms of where the sparseness is imposed on the loadings or on the weights, assumed model, and optimization criterion used to impose sparseness. Second, using an extensive simulation study, we assess each of these methods by means of performance measures such as squared relative error, misidentification rate, and percentage of explained variance for several data generating models and conditions for the population model. Finally, two examples using empirical data are considered.

The Use of Multiple Measurements in Taxonomic Problems

The articles published by the Annals of Eugenics (1925–1954) have been made available online as an historical archive intended for scholarly use. The work of eugenicists was often pervaded by prejudice against racial, ethnic and disabled groups. The online publication of this material for scholarly research purposes is not an endorsement of those views nor a promotion of eugenics in any way.

Analysis of a complex of statistical variables into principal components

Large Scale Geochemical Signatures Enable to Determine Landscape Use in the Deserted Medieval Villages

Multidimensional Scaling

Outlines a set of techniques that enables a researcher to explore the hidden structure of large databases. These techniques use proximities to find a configu

Assessing single-cell transcriptomic variability through density-preserving data visualization

Nonlinear data visualization methods, such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP), summarize the complex transcriptomic landscape of single cells in two dimensions or three dimensions, but they neglect the local density of data points in the original space, often resulting in misleading visualizations where densely populated subsets of cells are given more visual space than warranted by their transcriptional diversity in the dataset. Here we present den-SNE and densMAP, which are density-preserving visualization tools based on t-SNE and UMAP, respectively, and demonstrate their ability to accurately incorporate information about transcriptomic variability into the visual interpretation of single-cell RNA sequencing data. Applied to recently published datasets, our methods reveal significant changes in transcriptomic variability in a range of biological processes, including heterogeneity in transcriptomic variability of immune cells in blood and tumor, human immune cell specialization and the developmental trajectory of Caenorhabditis elegans. Our methods are readily applicable to visualizing high-dimensional data in other scientific domains.

Learning feature representation of Iberian ceramics with automatic classification models

In Cultural Heritage inquiries, a common requirement is to establish time-based trends between archaeological artifacts belonging to different periods of a given culture, enabling among other things to determine chronological inferences with higher accuracy and precision. Among these, pottery vessels are significantly useful, given their relative abundance in most archaeological sites. However, this very abundance makes difficult and complex an accurate representation, since no two of these vessels are identical, and therefore classification criteria must be justified and applied. For this purpose, we propose the use of deep learning architectures to extract automatically learned features without prior knowledge or engineered features. By means of transfer learning, we retrained a Residual Neural Network with a binary image database of Iberian wheel-made pottery vessels’ profiles. These vessels pertain to archaeological sites located in the upper valley of the Guadalquivir River (Spain). The resulting model can provide an accurate feature representation space, which can automatically classify profile images, achieving a mean accuracy of 0.96 with an f-measure of 0.96. This accuracy is remarkably higher than other state-of-the-art machine learning approaches, where several feature extraction techniques were applied together with multiple classifier models. These results provide novel strategies to current research in automatic feature representation and classification of different objects of study within the Archaeology domain.

Multidimensional Scaling of Northwest Coast Faunal Assemblages: A Case Study from Southern Haida Gwaii, British Columbia

Multidimensional scaling (MDS) has been previously applied successfully to the analysis of artifact assemblages from archaeological contexts. Despite the suitability of archaeological faunal data to such analysis, MDS has not been applied to faunal data. In this study, MDS analysis was applied to 21 faunal assemblages from 14 Graham Tradition sites in the Kunghit region of southern Haida Gwaii. A separation of salmon-dominated and rockfish-dominated assemblages provided the strongest result of this analysis, strengthening previous interpretations made for these data. Additionally, MDS analysis revealed functional and regional variability that had not been previously identified. Functionality was reflected in the separation of differing site types, while regional distribution of resources was also highlighted by the analysis. These results contribute to an understanding of Kunghit Haida subsistence and settlement while demonstrating the utility of MDS for faunal analysis. Dans le passé, le « multidimensional scaling » (MDS) a été utilisé avec succès pour analyser des ensembles d'artefacts dans nombreux contextes archéologiques. Malgré l'apparente pertinence d'une telle analyse pour les données fauniques, le MDS n'a pas été appliqué aux études fauniques. Dans cette étude, nous avons utilisé le MDS pour analyser 21 collections fauniques provenant de 14 sites de la tradition Graham, dans la région de Kunghit dans le sud de Haida Gwaii. La séparation entre les ensembles dominés par le saumon et ceux dominés par le sébaste est le résultat le plus pertinent de notre étude, appuyant ainsi les interprétations antérieures de ces données. De plus, l'analyse MDS a révélé pour la première fois une variabilité fonctionnelle et régionale. La variabilité fonctionnelle s'est reflétée par la reconnaissance de différents types de sites, tandis que l'analyse a permis la mise en évidence de la distribution des ressources dans la région. Ces résultats contribuent à la compréhension des schemes d'établissement et de subsistance des Kunghit Haida, tout en démontrant l'utilité du MDS pour les analyses fauniques.

LIII. On lines and planes of closest fit to systems of points in space

Incremental Learning for Robust Visual Tracking

Visual tracking, in essence, deals with non-stationary image streams that change over time. While most existing algorithms are able to track objects well in controlled environments, they usually fail in the presence of significant variation of the object’s appearance or surrounding illumination. One reason for such failures is that many algorithms employ fixed appearance models of the target. Such models are trained using only appearance data available before tracking begins, which in practice limits the range of appearances that are modeled, and ignores the large volume of information (such as shape changes or specific lighting conditions) that becomes available during tracking. In this paper, we present a tracking method that incrementally learns a low-dimensional subspace representation, efficiently adapting online to changes in the appearance of the target. The model update, based on incremental algorithms for principal component analysis, includes two important features: a method for correctly updating the sample mean, and a forgetting factor to ensure less modeling power is expended fitting older observations. Both of these features contribute measurably to improving overall tracking performance. Numerous experiments demonstrate the effectiveness of the proposed tracking algorithm in indoor and outdoor environments where the target objects undergo large changes in pose, scale, and illumination.

A Survey on Multidimensional Scaling

This survey presents multidimensional scaling (MDS) methods and their applications in real world. MDS is an exploratory and multivariate data analysis technique becoming more and more popular. MDS is one of the multivariate data analysis techniques, which tries to represent the higher dimensional data into lower space. The input data for MDS analysis is measured by the dissimilarity or similarity of the objects under observation. Once the MDS technique is applied to the measured dissimilarity or similarity, MDS results in a spatial map. In the spatial map, the dissimilar objects are far apart while objects which are similar are placed close to each other. In this survey article, MDS is described in comprehensive fashion by explaining the basic notions of classical MDS and how MDS can be helpful to analyze the multidimensional data. Later on, various special models based on MDS are described in a more mathematical way followed by comparisons of various MDS techniques.

Can Lithic Attribute Analyses Identify Discrete Reduction Trajectories? A Quantitative Study Using Refitted Lithic Sets

Quantitative, attribute-based analyses of stone tools (lithics) have been frequently used to facilitate large-scale comparative studies, attempt to mitigate problems of assemblage completeness and address interpretations of the co-occurrence of unrelated technological processes. However, a major barrier to the widespread acceptance of such methods has been the lack of quantified experiments that can be externally validated by theoretically distinct approaches in order to guide analysis and confidence in results. Given that quantitative, attribute-based studies now underpin several major interpretations of the archaeological record, the requirement to test the accuracy of such methods has become critical. In this paper, we test the utility of 31 commonly used flake attribute measurements for identifying discrete reduction trajectories through three refitted lithic sets from the Middle Palaeolithic open-air site of Le Pucheuil, in northern France. The experiment had three aims: (1) to determine which, if any, attribute measurements could be used to separate individual refitted sets, (2) to determine whether variability inherent in the assemblage was primarily driven by different reduction trajectories, as represented by the refitted sets, or other factors, and (3) to determine which multivariate tests were most suitable for these analyses. In order to test the sensitivity of the sample, we ran all analyses twice, the first time with all the available lithics pertaining to each refitted set and the second time with randomly generated 75 % subsamples of each set. All results revealed the consistent accuracy of 16 attribute measurements in quadratic and linear discriminant analyses, principal component analyses and dissimilarity matrices. These results therefore provide the first quantified attribute formula for comparative analyses of Levallois reduction methods and a basis from which further experiments testing core and retouch attributes may be conducted.

The computer-aided craniofacial reconstruction (CFR) technique has been widely used in the fields of criminal investigation, archaeology, anthropology and cosmetic surgery. The evaluation of craniofacial reconstruction results is important for improving the effect of craniofacial reconstruction. Here, we used the sparse principal component analysis (SPCA) method to evaluate the similarity between two sets of craniofacial data. Compared with principal component analysis (PCA), SPCA can effectively reduce the dimensionality and simultaneously produce sparse principal components with sparse loadings, thus making it easy to explain the results. The experimental results indicated that the evaluation results of PCA and SPCA are consistent to a large extent. To compare the inconsistent results, we performed a subjective test, which indicated that the result of SPCA is superior to that of PCA. Most importantly, SPCA can not only compare the similarity of two craniofacial datasets but also locate regions of high similarity, which is important for improving the craniofacial reconstruction effect. In addition, the areas or features that are important for craniofacial similarity measurements can be determined from a large amount of data. We conclude that the craniofacial contour is the most important factor in craniofacial similarity evaluation. This conclusion is consistent with the conclusions of psychological experiments on face recognition and our subjective test. The results may provide important guidance for three- or two-dimensional face similarity evaluation, analysis and face recognition.

A Global Geometric Framework for Nonlinear Dimensionality Reduction

Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs—30,000 auditory nerve fibers or 106 optic nerve fibers—a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.

Evaluating prepared core assemblages with three-dimensional methods: a case study from the Middle Paleolithic at Skhūl (Israel)

Levallois technology is a hallmark of many Middle and Late Pleistocene stone artifact assemblages, but its definition has been much debated. Here we use three-dimensional photogrammetry to investigate the geometric variation among Levallois and discoidal core technologies. We created models of experimental and archaeological stone artifact assemblages to quantitatively investigate the morphologies of Levallois and discoidal core technologies. Our results demonstrate that technological characteristics of Levallois technology can be distinguished from discoidal variants by analyzing the relative volumes and angles of the two flaking surfaces. We apply these methods to a random subset of Middle Paleolithic cores from Skhūl (Israel) and show that, overall, the Skhūl archaeological sample falls in range with the experimental Levallois sample. This study advocates the investigation of core technology on a spectrum to elucidate particular reduction trajectories while maintaining visible outliers and dispersion within an assemblage. Our quantified approach to studying centripetal core technology broadly is particularly applicable in studies related to forager mobility strategy and raw material use. Ultimately, the methods developed here can be used across temporal and geographic boundaries and facilitate attribute-based inter-site comparisons.

Landscape Classification using Principal Component Analysis and Fuzzy Classification: Archaeological Sites and their Natural Surroundings in Central Mongolia

The middle and upper Orkhon Valley in Central Mongolia (47.5°N, 102.5°E) hosts a multitude of diverse archaeological features. Most of them – including the well-known ancient cities of Karakorum and Karabalgasun – have only rarely been described in their geographical setups. The aim of this study is to describe, classify and analyse their surrounding landscapes and consequently characterise these sites geographically. This analysis is based on freely available raster datasets that offer information about topography, surface reflectance and derivatives. Principal component analysis is applied as a dimensional reduction technique. Subsequently, a fuzzy-logic approach leads to a classification scheme in which archaeological features are embedded and therefore distinguishable. A distinct difference in preferences regarding to choose a site location can be made and confirmed by semiautomatic analysis, comparing burial and ritual places and settlements. Walled enclosures and settlements are connected to planar steppe regions, whereas burial and ritual places are embedded in mountainous and hilly environments.

Understanding standardization and variation in Mediterranean ceramics

This volume is designed as a wide-ranging analysis of ceramic standardization and variation, and as a contribution to pottery studies in the Mediterranean and beyond. It originates in a conference session in the 16th annual Meeting of the European

Principal Component Analysis

Principal component analysis is central to the study of multivariate data. Although one of the earliest multivariate techniques, it continues to be the subject of much research, ranging from new model-based approaches to algorithmic ideas from neural networks. It is extremely versatile, with applications in many disciplines. The first edition of this book was the first comprehensive text written solely on principal component analysis. The second edition updates and substantially expands the original version, and is once again the definitive text on the subject. It includes core material, current research and a wide range of applications. Its length is nearly double that of the first edition. Researchers in statistics, or in other fields that use principal component analysis, will find that the book gives an authoritative yet accessible account of the subject. It is also a valuable resource for graduate courses in multivariate analysis. The book requires some knowledge of matrix algebra. Ian Jolliffe is Professor of Statistics at the University of Aberdeen. He is author or co-author of over 60 research papers and three other books. His research interests are broad, but aspects of principal component analysis have fascinated him and kept him busy for over 30 years.

La necropoli laziale di Osteria dell'Osa

La necropoli dell’età del ferro di Osteria dell’Osa (Roma, km 17,500 della via Prenestina), datata fra il IX e gli inizi del VI sec. a.C., è un complesso di 600 sepolture a incinerazione e a inumazione portate alla luce con quindici anni di scavi archeologici sistematici. L’opera comprende un’ampia sezione archeologica, con capitoli dedicati alla geografia storica e alla geo-morfologia dell’area della necropoli, allo scavo, agli aspetti teorici e metodologici dello studio di necropoli, alla classificazione dei materiali, al rituale, alla ricostruzione della struttura e organizzazione della comunità e dei suoi rapporti con le regioni vicine. Sono presenti inoltre un'analisi antropologica completa dei resti scheletrici, corredata da studi sullo stato di conservazione e sugli isotopi stabili delle ossa, analisi fisico-chimiche della ceramica e dei metalli e un esperimento di riproduzione della ceramica.

Visualizing data using t-SNE

We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. t-SNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for high-dimensional data that lie on several different, but related, low-dimensional manifolds, such as images of objects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large data sets, we show how t-SNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of t-SNE on a wide variety of data sets and compare it with many other non-parametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by t-SNE are significantly better than those produced by the other techniques on almost all of the data sets

Exploratory data analysis

This book serves as an introductory text for exploratory data analysis. It exposes readers and users to a variety of techniques for looking more effectively at data. The emphasis is on general techniques, rather than specific problems

The Reliability and Validity of a Lithic Debitage Typology: Implications for Archaeological Interpretation

Sullivan and Rozen's (1985) debitage typology has been proposed as a method for measuring the effects of variation in lithic reduction by describing “distinctive assemblages.” This is in contrast to many traditional analytical methods oriented toward identifying the effects of lithic reduction techniques on individual flakes. Debate over the use of the typology has focused primarily on the ability of the typology to accurately measure variation in lithic reduction behavior, and secondarily on the role of experimental studies in archaeology. In this paper I present an analysis designed to estimate the reliability and validity of the typology. An experimental design is developed to permit data collection with minimal analyst induced random or systematic error. Principal components analysis and the coefficient theta demonstrate that the typology provides reliable or replicable results when applied to debitage assemblages of similar technological origin. Further principal components analysis suggests that the instrument is of limited utility in recognizing effects of variation in reduction activities associated with highly vitreous lithic raw materials. A means of expanding the typology and increasing its accuracy in archaeological pattern recognition is presented., RésuméLa tipología de los restos líticos de Sullivan y Rozen (1985) ha sido propuesta como un método por medir el efecto de la variabilidad en “reducción distintiva” de cada conjunto. Este mótodo contrasta con otros mátodos analitícos tradicionales, que están orientados a la identificción de los efectos de las técnicas de reducción lítica en las lascas individuates. El debate sobre los usos de la tipología se har enfocado primeramente en la abilidad de la tipología para de tectar exactamente la variación en el proceso de reducción lítica, y segundamente en elpapel de los estudios arqueológicos experimentales. En este artículo, presento un análisis diseñado para estimar la confiabilidad y validez de la tipología. Un diseho experimental estd desarollado para permitir lafue colleción de datos con un mmímo de errores inducidos casual y sistematicamente. La análisis de los componentes principales y el coeficiente theta demuestra que la tipología produce resultados confiables y reproducible cuando se aplica a conjuntos de restos de origen tecnico similar. El andlisis adicional de los componentes principales sugiere que la tipología es de utilidad limitada para reconocer los efectos de variación en actividades de reducción. Este papel amplía la tipologia y aumenta la precisión para reconocer patrones arqueológicos.

An analytical strategy based on Fourier transform infrared spectroscopy, principal component analysis and linear discriminant analysis to suggest the botanical origin of resins from Bursera. Application to archaeological Aztec Samples

Bursera species are the source of oleoresins that have been used by pre-Columbian American cultures as adhesives, raw materials for molding figurines, ritual offerings, among other uses. Spread along different museum collections all over the world, pre-Columbian artefacts contain these resins. The preservation and understanding of the technology of fabrication of these pieces constitute a major concern for conservators, historians and archaeologists. Few studies have so far dealt with the chemical composition and the botanical origin of Mexican copal, owing maybe to the difficulty on the procuration of resins from known botanical origin. In this work, fresh resins from six Mexican Bursera species, namely B. bipinnata, B. excelsa, B. grandifolia, B. laxiflora, B. penicillata and B. stenophylla, were analyzed by Fourier-transformed infrared spectroscopy (FTIR). Main spectral band positions were selected for chemometric analysis using principal component analysis (PCA), based on the loading plot of chemometric analysis. Sample distribution patterns were investigated with PCA. Score plots revealed a sample agglomeration with good differentiation in 5 out of the 6 species. This method was validated by linear discriminant analysis (LDA) with a 95.2% of global positive recognition for certified origin species. To compare the efficiency of this approach, high performance liquid chromatography coupled to diode array detection (HPLC-DAD) and FTIR results were coupled to PCA and LDA, for the same set of samples. “FTIR showed 94.4% of samples correctly assigned on the confusion matrix and 91% on the cross validation one. HPLC-LDA showed 100% of correct assignment in the confusion matrix and 95% on the cross validation one. These results are encouraging, as FTIR is much faster and less expensive than chromatographic techniques and it could more readily be available in conservation laboratories. Finally, an application to the identification of the botanical origin of four archaeological Aztec copal samples was performed and the model suggested an origin on B. bipinnata/B. stenophylla for these archaeological samples.

Introduzione alla protostoria italiana

Non-Euclidean Distances in Point Pattern Analysis: Anisotropic Measures for the Study of Settlement Networks in Heterogeneous Regions

In the statistical analysis of spatial point patterns, stationarity is often assumed to mean that the spatial point process has constant intensity and uniform correlation depending only on the lag vector between pairs of points. In other words, it is assumed that the correlation between the elements of a spatial distribution is a function of the Euclidean distance between them. This framework has been vastly used in Spatial Analysis to describe settlement processes, taking into account a homogenous and undifferentiated surface that is easy to generalise. These assumptions fail when we consider the historical and economical dynamics that took place in space.

Specialization, standardization and diversity: A retrospective

Quantitative Analysis in Archaeology

Quantitative Analysis in Archaeology introduces the application of quantitative methods in archaeology. It outlines conceptual and statistical principles, illustrates their application, and provides problem sets for practice. Discusses both methodological frameworks and quantitative methods of archaeological analysisPresents statistical material in a clear and straightforward manner ideal for students and professionals in the fieldIncludes illustrative problem sets and practice exercises in each chapter that reinforce practical application of quantitative analysis

Sparse Principal Component Analysis

Principal component analysis (PCA) is widely used in data processing and dimensionality reduction. However, PCA suffers from the fact that each principal component is a linear combination of all the original variables, thus it is often difficult to interpret the results. We introduce a new method called sparse principal component analysis (SPCA) using the lasso (elastic net) to produce modified principal components with sparse loadings. We first show that PCA can be formulated as a regression-type optimization problem; sparse loadings are then obtained by imposing the lasso (elastic net) constraint on the regression coefficients. Efficient algorithms are proposed to fit our SPCA models for both regular multivariate data and gene expression arrays. We also give a new formula to compute the total variance of modified principal components. As illustrations, SPCA is applied to real and simulated data with encouraging results.

Characterization of Iron age pottery from eastern Turkey by laser- induced breakdown spectroscopy (LIBS)

Selected pottery sherds coming from the Ayanis, Dilkaya and Karagündüz excavations in eastern Turkey and dated from the Early to Middle Iron Age were examined as regards their composition by using laser-induced breakdown spectroscopy (LIBS). The objective of the study was first to investigate the potential of the LIBS technique in the compositional analysis of pottery and further to explore correlations in spectral data, by using chemometrics methods that would possibly enable discrimination among different sherds. This work is part of a broader study aiming to examine clay variability both before and during the Urartian State period and to explore possible relationships and differences among pottery objects from fortresses and settlements or settlements and cemeteries on the basis of the clay composition of sherds. Preliminary results demonstrate that by using the LIBS technique it is possible to analyse pottery sherds in qualitative and semi-quantitative ways, providing information on the clay and slip composition. Furthermore, encouraging results have been obtained by carrying out principal component analysis (PCA) on the LIBS spectra, which suggest that in certain cases, it is possible to directly correlate spectral information with the origin of pottery sherds.

Review of Dimension Reduction Methods

Archaeometric characterisation of ancient pottery belonging to the archaeological site of Novalesa Abbey (Piedmont, Italy) by ICP-MS and spectroscopic techniques coupled to multivariate statistical tools

This work presents the archaeometric characterisation of a group of ancient pottery remains discovered during the restoring of the Novalesa Abbey (Susa Valley, Turin, Italy) performed in 2000. The characterisation focuses on the achievement of information about provenance and production process of the samples. Firstly, the data concerning the multi-element characterisation of the samples by inductively coupled plasma-mass spectrometry (ICP-MS) were analysed by chemometric tools (principal component analysis and cluster analysis) in order to obtain information about their similarity and clustering. These information, integrated with the results of micro-Raman spectroscopy analysis of the inclusions shed light on differences in the production process of the samples.

Full three-dimensional imaging via ground penetrating radar: assessment in controlled conditions and on field for archaeological prospecting

This paper deals with an advanced microwave tomographic approach capable of providing full 3D images of buried targets from scattered field data gathered by means of Ground Penetrating Radar (GPR) systems. The approach is based on an approximated model of the scattering phenomenon and it is capable of accounting for the vectorial nature of the interactions occurring between electromagnetic waves and probed materials. Moreover, the Truncated Singular Value Decomposition inversion scheme is exploited to solve the involved linear inverse scattering problem in a stable and accurate way. The advantages offered by the full 3D inversion algorithm with respect to a commonly adopted strategy, which produces 3D images by interpolating 2D reconstructions, are assessed against GPR data gathered in laboratory controlled conditions. Moreover, to provide an example of the full 3D imaging capabilities in on field conditions, we report on a GPR measurement campaign carried out at Grotte dell’Angelo, Pertosa, (SA), Southern Italy, one of the most famous sites of the Cilento and Vallo di Diano geopark.

Nonlinear Dimensionality Reduction for Data Visualization: An Unsupervised Fuzzy Rule-Based Approach

In this article, we propose a general framework for the unsupervised fuzzy rule-based dimensionality reduction primarily for data visualization. This framework has the following important characteristics relevant to the dimensionality reduction for visualization: preserves neighborhood relationships; effectively handles data on nonlinear manifolds; capable of projecting out-of-sample test points; can reject test points, when it is appropriate; and interpretable to a reasonable extent. We use the first-order Takagi–Sugeno model. Typically, fuzzy rules are either provided by experts or extracted using an input–output training set. Here, neither the output data nor experts are available. This makes the problem challenging. We estimate the rule parameters minimizing a suitable objective function that preserves the interpoint geodesic distances (distances over the manifold) as Euclidean distances on the projected space. In this context, we propose a new variant of the geodesic c-means clustering algorithm. The proposed method is tested on several synthetic and real-world datasets and compared with the results of six state-of-the-art data visualization methods. The proposed method is the only method that performs equally well on all the datasets tried. Our method is found to be robust to the initial conditions. The predictability of the method is validated by suitable experiments. We also assess the ability of our method to reject test points when it should. The scalability issue of the scheme is also discussed. Due to the general nature of the framework, we can use different objective functions to obtain projections satisfying different goals. To the best of our knowledge, this is the first attempt to manifold learning using unsupervised fuzzy rule-based modeling.

Identifying Ancient Ceramics Using Laser-Induced Breakdown Spectroscopy Combined with a Back Propagation Neural Network

This study investigated the rapid identification of ceramics via laser-induced breakdown spectroscopy (LIBS) to realize the identification of ancient ceramics from different regions. Ceramics from different regions may have large differences in their elemental composition. Thus, using LIBS technology for ceramic identification is feasible. The spectral intensities of 11 common elements, namely, Si, Al, Fe, Ca, Mg, Ti, Mn, Na, K, Sr, and Ba, in ceramics were selected as classification indices. Principal component analysis (PCA) and kernel principal component analysis (KPCA) combined with the back propagation (BP) neural network were used to identify ceramics. Furthermore, the effects of the PCA and KPCA data processing methods were compared. Finally, this work aimed to select a suitable method for obtaining spectral data on ceramics identified by LIBS through experiments. Results revealed that LIBS technology could aid the routine, rapid, and on-site analysis of archeological objects to rapidly identify or screen various types of objects.

Encyclopedia of Machine Learning

A Tutorial on Principal Component Analysis

Principal component analysis (PCA) is a mainstay of modern data analysis - a black box that is widely used but (sometimes) poorly understood. The goal of this paper is to dispel the magic behind this black box. This manuscript focuses on building a solid intuition for how and why principal component analysis works. This manuscript crystallizes this knowledge by deriving from simple intuitions, the mathematics behind PCA. This tutorial does not shy away from explaining the ideas informally, nor does it shy away from the mathematics. The hope is that by addressing both aspects, readers of all levels will be able to gain a better understanding of PCA as well as the when, the how and the why of applying this technique.

Remote Sensing, Archaeological, and Geophysical Data to Study the Terramare Settlements: The Case Study of Fondo Paviani (Northern Italy)

During the Middle and Recent Bronze Age, the Po Plain and, more broadly Northern Italy were populated by the so-called “Terramare”, embanked settlements, surrounded by a moat. The buried remains of these archaeological settlements are characterized by the presence of a system of palaeo-environments and a consequent natural gradient in soil moisture content. These differences in the soil are often firstly detectable on the surface during the seasonal variations, with aerial, satellite, and Laser Imaging Detection and Ranging (LIDAR) images, without any information on the lateral and in-depth extension of the related buried structures. The variation in the moisture content of soils is directly related to their differences in electrical conductivity. Electrical resistivity tomography (ERT) and frequency domain electromagnetic (FDEM), also known as electromagnetic induction (EMI) measurements, provide non-direct measurements of electrical conductivity in the soils, helping in the reconstruction of the geometry of different buried structures. This study presents the results of the multidisciplinary approach adopted to the study of the Terramare settlement of Fondo Paviani in Northern Italy. Remote sensing and archaeological data, collected over about 10 years, combined with more recent ERT and FDEM measurements, contributed to the analysis of this particular, not yet wholly investigated, archaeological site. The results obtained by the integrated multidisciplinary study here adopted, provide new useful, interesting information for the archaeologists also suggesting future strategies for new studies still to be conducted around this important settlement.

The interplay between adjacent Adige and Po alluvial systems and deltas in the late Holocene (Northern Italy)

The alluvial plain behind the southern Venice Lagoon, in northern Italy, is characterised by the presence of a complex network of alluvial ridges formed by the aggradation of channel deposits and natural levees. They are the geomorphological products of the interaction between the Adige and Po during the late Holocene. New geomorphological and stratigraphic data provided a detailed reconstruction of the evolution of the Po-Adige alluvial plain and allowed for defining the relation with the migration of delta lobes in the southern Venice Lagoon and near the Adriatic coast. Three cross sections (total of 28 manual boreholes) were obtained using a hand auger through two alluvial ridges built by the Adige River: the first two on the Conselve ridge (in the locations of Conselve and Santa Margherita, respectively) and the third across the modern Adige alluvial ridge. Radiocarbon dating was carried out on 9 peat samples. These data, coupled with a DTM, allowed the identification of a confluence between the Adige and Po during the Bronze Age. They also enabled the identification of the timing of major Adige and Po avulsions and the correlation with the 5.0-1.5 ka cal. BP development of a delta-mouth sedimentary system recognised in the southern Venice Lagoon. The results show that major avulsive events in the upstream tracts of the Po and Adige Rivers forced the migration of delta lobes. The delta system in the southern Venice Lagoon was fed by the Saline-Cona Po branch from about 4-3 ka cal. BP. This implies that the Po delta extended as far as 30 km north of the present-day river position. In this time frame, the Adige did not directly reach the sea, as it was a tributary of the Po at Agna. The northernmost lobe of the Po delta was thus fed by the sedimentary input of both the Po and Adige. The subsequent deactivation of the Saline-Cona Po branch through avulsion just upstream of Rovigo, around 3 ka cal. BP, led to a southwards shift of the Po delta system. The Adige still flowed through the Conselve ridge and kept its mouth in the same area. From 3 ka cal. BP to Roman times, the river constructed its own delta in the southern Venice Lagoon, prograding on the previous northern Po delta lobe. The Adige avulsion at Bonavigo during the early Middle Ages led to the abandonment of the Montagnana-Este-Conselve course long the southern foot of the Euganean Hills. As a consequence, the delta in the southern Venice Lagoon was also definitely abandoned and the Adige River started to construct another delta in its present-day position, about 15 km further to the south.

Geoarcheologia delle Valli Grandi Veronesi e Bonifica Padana (Rovigo): uno scenario evolutivo

Il ripostiglio di San Basilio (Ariano Polesine-Rovigo): denari e quinari di età repubblicana

San Basilio (Ariano Polesine). Scavo nell'area di un insediamento romano, luglio 1977

Spina. Scavi nell'abitato della città etrusca 2007-2009

Il volume illustra i recenti scavi archeologici, tra il 2007 e il 2009 dell’importante città di Spina. Essa rappresentava una porta dell’Etruria padana verso la Grecia e l’Oriente, una città cosmopolita tra Po e Adriatico, punto d’incontro di uomini e merci. Tutto questo è stata Spina, la città portuale etrusca sorta negli ultimi decenni del VI secolo a.C. alla confluenza tra un fiume appenninico e un ramo del Po, a breve distanza dal mare. Vengono qui pubblicati i materiali rinvenuti durante le campagne di scavo, dalle ceramiche (alle quali viene dato grande risalto), ai metalli, alla coroplastica, oltre alle analisi di archeozoologia e archeobotanica per la ricostruzione del paesaggio antico.

Spina città liquida. Gli scavi 1977-1981 nell'abitato e i materiali tardo-arcaici e classici

Le ultime fasi del sito di San Cassiano di Crespino e le trasformazioni nell’entroterra di Adria

San Basilio di Ariano Polesine,

Testimonianze di traffici commerciali in età romana nel delta padano attraverso alcune classi di materiali dello scavo di San Basilio di Ariano Polesine (Rovigo)

Adria. L’abitato etrusco

Conference papers on the Etruscan city of Spina (Ferrara, Italy)

Ricerche sull' evoluzione del delta padano

San Basilio (Ariano Polesine): seconda campagna di scavo. Agosto 1978

La villa rustica di San Basilio

L’abitato arcaico di San Basilio

L’Eridano, il Po e i suoi rami. Un paesaggio culturale e le sue trasformazioni tra fonti letterarie e testimonianze archeologiche

Il complesso di San Cassiano di Crespino (RO): aspetti culturali e rapporti con il territorio

Le Balone. Insediamento etrusco presso un ramo del Po, Catalogo della mostra,

L’entroterra di Adria: conoscenze archeologiche e paleoambientali

L’abitato arcaico di San Basilio di Ariano Polesine

Gli scavi archeologici nel podere Forzello a San Basilio di Ariano Polesine

Ceramica di uso comune da S. Basilio di Ariano nel Polesine (Rovigo),

L’insediamento di S. Basilio di Ariano Polesine

La città e il sacro in Etruria padana: riti di fondazione e assetti urbanistico istituzionali

Carta Geomorfologica della Pianura Padana a scala 1:250.000.

Questa carta rappresenta gli aspetti geomorfologici della Pianura Padana studiati nell’ambito di una ricerca che, a partire dalla metà degli anni ‘80, ha coinvolto ricercatori di tutte le Università del Nord Italia. Tale ricerca ha prodotto due carte ciascuna divisa in 3 settori: la Carta Geomorfologica della Pianura Padana, e la Carta Altimetrica e dei movimenti verticali del suolo della Pianura Padana, entrambe alIa scala di 1:250.000, composte di tre fogli ciascuna. Le carte sono state pubblicate insieme nel 1997, e presentate in occasione della Forth International Conference on Geomorphology che si tenne a Bologna dal 28 Agosto al 3 Settembre di quell'anno. Dal punto di vista geografico, la pubblicazione di queste carte ha colmato un'evidente lacuna nella conoscenza di una delle più importanti «regioni naturali» d'Italia e d'Europa.

Sigillata nord-italica da San Basilio di Ariano nel Polesine (Ro- vigo)

L’insediamento antico di San Basilio di Ariano nel Polesine

Tel Achziv : les fouilles d’une antique cité phénicienne

Le royaume de Tyr dans la seconde moitié du IVe siècle av. J.C.

Problèmes et entraves de l’historien au Proche-Orient : l’exemple du Liban

Bibliography of Digital Archaeology

7067 resources