Abstract: Random forests are an important ensemble learning method, widely used in data classification and nonparametric regression. In this paper, we review three main theoretical issues of random forests: the convergence theorem, the generalization error bound, and out-of-bag estimation. Finally, we present an improved random forests algorithm that uses a feature-weighting sampling method to sample a subset of features at each node when growing trees. The new method is well suited to classification problems on very high-dimensional data.
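As a rough illustration of the feature-weighting idea, the sketch below draws a feature subset for one tree node with probability proportional to a per-feature weight. The abstract does not say how the weights are computed, so the `weights` array, the function name `sample_features`, and the subset size `k` are all hypothetical.

```python
import numpy as np

def sample_features(weights, k, rng):
    """Draw k distinct feature indices, each with probability proportional to its weight.

    `weights` is a hypothetical per-feature informativeness score;
    the abstract does not specify how such weights are obtained.
    """
    w = np.asarray(weights, dtype=float)
    p = w / w.sum()  # normalize weights into a probability distribution
    return rng.choice(len(w), size=k, replace=False, p=p)

rng = np.random.default_rng(0)
# Features with zero weight can never be selected.
subset = sample_features([0.0, 1.0, 2.0, 0.0, 3.0], k=3, rng=rng)
print(sorted(subset.tolist()))
```

Compared with the uniform sampling used in standard random forests, weighted sampling makes it less likely that a node sees only uninformative features, which matters when most dimensions are noise.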
Abstract: Maximal clique analysis is an important method of stock market graph analysis. Traditional maximal clique enumeration algorithms enumerate all maximal cliques in the graph, which is too slow to support efficient stock market graph analysis. In this paper, we propose interactive visualization methods for large-scale stock market graphs. Given the stocks a user is interested in, we provide functions to quickly enumerate all maximal cliques related to those stocks and to view their combination relations as well as other related stocks. These interactive visualization methods are very useful for stock market graph analysis, but traditional maximal clique enumeration algorithms cannot support them. Because these functions require enumerating all maximal cliques related to specific nodes or edges, we propose a new algorithm that enumerates the maximal cliques containing specific nodes or edges. We use a real dataset to verify the superior performance of our algorithm.
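The task of enumerating only the maximal cliques that contain a given node can be sketched with the classic Bron-Kerbosch recursion, seeded with that node and restricted to its neighbours. This is a textbook technique shown for illustration, not the algorithm proposed in the paper; the function name `cliques_containing` and the toy graph are hypothetical.

```python
def cliques_containing(adj, v):
    """Yield every maximal clique of the graph that contains node v.

    adj maps each node to the set of its neighbours (undirected graph).
    Uses Bron-Kerbosch without pivoting, started from R={v}, P=N(v), X={}.
    """
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)  # r cannot be extended, so it is maximal
            return
        for u in list(p):
            yield from expand(r | {u}, p & adj[u], x & adj[u])
            p = p - {u}  # u has been fully explored
            x = x | {u}  # exclude u from later branches

    yield from expand({v}, set(adj[v]), set())

# Toy graph: triangle A-B-C plus a pendant edge C-D.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
for clique in cliques_containing(graph, "C"):
    print(sorted(clique))
```

Because the search only ever visits neighbours of the query node, its cost depends on the size of that neighbourhood rather than on the whole graph.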
Abstract: Co-clustering algorithms cluster a data matrix into row clusters and column clusters simultaneously. In this paper, we propose TLWCC, a two-level subspace weighting co-clustering algorithm, which introduces the idea of two-level subspace weighting into the co-clustering process. TLWCC places the first level of weights on co-clusters and the second level of weights on rows and columns. The three types of weights (co-cluster, row, and column weights) are computed during the clustering process according to the distances between co-clusters (or rows or columns) and their centers. The larger the distance, the stronger the noise it implies, so a smaller weight is assigned, and vice versa. Thus, by giving small weights to noise, TLWCC filters out the noise and improves the co-clustering result. We propose an iterative algorithm to optimize the model. We carried out four experiments to evaluate TLWCC. The first experiment investigated the properties of the three types of weights. The second studied how the clustering result is influenced by the parameters. The third compared the clustering performance of TLWCC with that of three other algorithms. The fourth examined the computational efficiency of the proposed algorithm.
Abstract: To help children with hearing loss, we extracted error-prone pronunciation texts and confusable pronunciation text pairs from a large amount of audio pronunciation data recorded from such children. We designed a data-driven 3D talking head articulatory animation system, driven by articulatory movements collected with an electromagnetic articulography (EMA) AG500 device, that simulates Chinese articulation realistically. With this system, children with hearing loss can observe the motions of the speaker's lips and tongue during pronunciation, which can help them practice pronunciation and correct their errors. Finally, a perception test was conducted to evaluate the system's performance. The results showed that the 3D talking head system can animate both internal and external articulatory motions effectively. A modified CM model based synthesis method was used to generate the articulatory movements. The root mean square (RMS) error between the real and synthetic articulatory movements was used to evaluate the synthesis method, and the average RMS error was 1.25 mm.
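A minimal sketch of how an RMS error between real and synthetic articulatory trajectories might be computed is given below. The abstract does not state the exact definition, so this assumes each trajectory is an array of 3D sensor positions in millimetres and takes the RMS of the per-frame Euclidean distance; the function name `rms_error` and the sample values are hypothetical.

```python
import numpy as np

def rms_error(real, synth):
    """RMS of the per-frame Euclidean distance between two trajectories.

    real, synth: arrays of shape (frames, 3) holding positions in mm.
    This definition is an assumption; the abstract gives no formula.
    """
    real = np.asarray(real, dtype=float)
    synth = np.asarray(synth, dtype=float)
    sq_dist = np.sum((real - synth) ** 2, axis=-1)  # squared distance per frame
    return float(np.sqrt(np.mean(sq_dist)))

# Two frames: the first is off by a 3-4-5 triangle (5 mm), the second matches.
real = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
synth = [[3.0, 4.0, 0.0], [1.0, 1.0, 1.0]]
print(rms_error(real, synth))
```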
Abstract: Because text-to-text semantic similarity features are difficult to extract in spontaneous speech evaluation, this paper uses a WordNet-based Lesk algorithm to calculate the semantic similarity between words, defines a word-to-text semantic similarity algorithm based on the word-to-word similarity, and proposes a complete set of WordNet-based methods for extracting text-to-text semantic similarity features. In the experiments, we use this algorithm to extract the text-to-text semantic similarity feature between students' answers and the standard answers, and analyze the correlation between the feature and the teachers' ratings. Experimental results show that the algorithm can effectively characterize the text-to-text semantic similarity between the students' answers and the standard answers.
Abstract: This paper proposes a multi-ocular localization and tracking system consisting of three parts: near-infrared illuminating circuits, tracking tools with passive retro-reflective markers, and a multiple-camera array. The localization algorithm covers feature detection, point correspondence, 3D point reconstruction, and target recognition. The validity and performance of the proposed system and algorithms have been verified by simulated and real-data experiments.
Abstract: This paper presents an analog IC for ECG monitoring devices. The system consists of an instrumentation amplifier with a driven-right-leg circuit, a second-order active low-pass filter, a second amplification stage, a low-dropout voltage regulator (LDO) with high power supply rejection ratio (PSRR), and a lead-off monitoring circuit. The chip includes all functions necessary for industrial applications and offers high common-mode rejection and power supply rejection performance. The design is fabricated in an SMIC 0.18 μm CMOS process, and the measured results demonstrate the chip's functionality. The mid-band gain is 51 dB, and the CMRR and PSRR reach 75 dB and 90 dB, respectively. The IC consumes 190 μA of current from a 2.9 V to 5.5 V supply.
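To put the decibel figures in linear terms, the short sketch below converts them to ratios. It assumes the gain, CMRR, and PSRR are all voltage ratios, so the 20·log10 convention applies; the abstract does not state this explicitly, and the helper name `db_to_ratio` is hypothetical.

```python
def db_to_ratio(db):
    """Convert a voltage-ratio figure in decibels to a linear ratio (20*log10 convention)."""
    return 10 ** (db / 20)

# Reported figures from the abstract, assumed to be voltage ratios.
for name, db in [("mid-band gain", 51), ("CMRR", 75), ("PSRR", 90)]:
    print(f"{name}: {db} dB is roughly {db_to_ratio(db):.0f} V/V")
```

Under that assumption, the 51 dB mid-band gain corresponds to a voltage gain of roughly 355.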
Abstract: Micro-CT is a new three-dimensional imaging tool based on X-ray imaging, with ultrahigh spatial resolution. It can be used to image all kinds of samples or live small animals non-destructively. In this paper, a micro-CT prototype system with high spatial resolution was developed. In the projection images and reconstructed cross-sectional images of a small insect sample, the phase-contrast effect, with its characteristic edge enhancement, can be observed clearly. The cross-sectional images also show that the developed micro-CT system can resolve details down to 12 micrometers.
Abstract: Introduction: Magnetic resonance elastography (MRE) is a non-invasive technique for quantitatively assessing the mechanical properties of biological tissue. Among the available actuation methods, pneumatic drivers are widely used because of their good MR compatibility. In this study, we set up a prototype pneumatic driver and investigated its performance under MRE with stimulations of various amplitude-frequency pairs. Materials and Methods: Load-free performance of the driver was tested using ultrasound with stimulations of various amplitude-frequency pairs. Performance under load was tested using MRE with both tissue-mimicking phantoms and leg muscle. Results: The frequency and amplitude of the external stimulation, the mechanical properties of the biological tissue, and the length and diameter of the transmission tube were all found to affect the elastogram reconstruction. Conclusions: An MR-compatible actuating driver that generates reliable mechanical stimulation is required to guarantee stable propagating shear waves in tissue. Optimal actuating parameters should be determined for the specific tissue in MRE practice.
Abstract: Protein translation is a remarkably complex and crucial process of life. Translation speed varies along the mRNA to coordinate co-translational protein folding, and such variations may have drastic effects on the final conformation of the encoded protein. In this paper, we choose the HisA protein, which has a TIM beta-alpha barrel fold, from two different species to investigate the factors that may be involved in modulating the translation process for the formation of structural symmetry. We explore the association between symmetry in protein structure and several intragenic features: local codon usage bias along the codon sequence, local charge distribution of the encoded protein sequence, and local GC content distribution along the nucleotide sequence. Our results show that, for HisA proteins from both species, symmetry in structure is correlated with codon usage bias, residue charge, and GC content.
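One of the intragenic features above, local GC content along a nucleotide sequence, can be sketched as a sliding-window fraction. The window size, step, function name `local_gc_content`, and the toy sequence below are hypothetical; the abstract does not say which settings were used.

```python
def local_gc_content(seq, window, step=1):
    """Return the GC fraction of each sliding window along a nucleotide sequence.

    window and step are illustrative parameters, not values from the paper.
    """
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions

# Toy sequence: GC-rich at the start, AT-rich at the end.
print(local_gc_content("GGCCAATT", window=4, step=2))  # [1.0, 0.5, 0.0]
```

A profile like this can then be compared across the repeated structural units of the protein to look for matching patterns.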
Abstract: This paper introduces the improvement project for the draft tube of Unit 2 in the third power station of the Gutianxi Hydropower Plant. The project includes partial reconstruction of the draft tube, replacement of the draft tube manhole door, and grouting of the draft tube base.