Rui Haohui, Nie Zedong, Zeng Guang, Qin Wenjian
Online: March 03, 2025 DOI: 10.12146/j.issn.2095-3135.20241224001
Abstract:Image denoising methods based on deep learning have effectively solved the problems of cumbersome parameter tuning and complex noise modeling in traditional denoising methods. However, supervised training relies heavily on pairs of clean and noisy images, which limits the wide application of such models. Unsupervised denoising models require only single noisy images for training, but existing unsupervised methods still struggle to balance training efficiency and denoising performance. In this paper, we propose an efficient image denoising method that improves the efficiency of denoising model training. Specifically, we propose a deep neighbor downsampler, which obtains similar image pairs from the same noisy image for training the denoising model. The proposed sampler not only ensures that the pixels of each image pair are adjacent and similar in appearance, but also discards some redundant information and avoids heavy dependence on assumptions about the noise distribution. Finally, we verify the effectiveness of our method through synthetic experiments with different noise distributions in the sRGB space and through experiments on real images. The experimental results confirm that the proposed sampling strategy effectively balances training efficiency and denoising performance.
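The abstract does not spell out how the deep neighbor downsampler works. As a rough illustration of the general idea of drawing two similar sub-images from one noisy image, the sketch below implements plain 2x2 neighbor sub-sampling in NumPy: each 2x2 cell contributes one pixel to each sub-image, and the two pixels are always horizontally or vertically adjacent. The function name `neighbor_subsample`, the cell size, and the random pairing rule are assumptions made for illustration, not the authors' method.

```python
import numpy as np

# Positions inside a 2x2 cell are numbered 0..3 in row-major order:
#   0 1
#   2 3
# Each row lists an ordered pair of positions that share an edge.
_ADJACENT = np.array(
    [[0, 1], [1, 0], [0, 2], [2, 0], [1, 3], [3, 1], [2, 3], [3, 2]]
)


def neighbor_subsample(noisy, rng):
    """Split one noisy image (H, W) into two half-resolution sub-images.

    Illustrative sketch only: for every 2x2 cell, one adjacent pixel pair
    is chosen at random; the first pixel goes to sub_a, the second to sub_b.
    """
    h, w = noisy.shape
    h2, w2 = h // 2, w // 2
    # Crop to an even size, then gather the 4 pixels of each cell on axis 2.
    cells = (
        noisy[: h2 * 2, : w2 * 2]
        .reshape(h2, 2, w2, 2)
        .transpose(0, 2, 1, 3)
        .reshape(h2, w2, 4)
    )
    # Pick one adjacent pair per cell.
    choice = rng.integers(0, len(_ADJACENT), size=(h2, w2))
    idx_a = _ADJACENT[choice, 0]
    idx_b = _ADJACENT[choice, 1]
    sub_a = np.take_along_axis(cells, idx_a[..., None], axis=2)[..., 0]
    sub_b = np.take_along_axis(cells, idx_b[..., None], axis=2)[..., 0]
    return sub_a, sub_b
```

In a training loop of this kind, one sub-image would typically serve as the network input and the other as the target, so no clean reference image is needed. How the paper's downsampler discards redundant information is not stated in the abstract and is not modeled here.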
Online: March 03, 2025 DOI: 10.12146/j.issn.2095-3135.20241129002
Abstract:Currently, with the exponential growth of data on the internet, the complexity of big data processing systems has also increased dramatically. To adapt to changes in factors such as cluster resources, datasets, and applications, big data processing systems provide adjustable configuration parameters tailored to different application scenarios. Among these systems, Spark is one of the most popular and contains over 200 configuration parameters for controlling parallelism, I/O behavior, memory settings, and compression. Incorrect configuration of these parameters often leads to severe performance degradation and stability issues. However, both ordinary users and expert administrators face significant challenges in understanding and tuning these settings for optimal performance, resulting in substantial human and time costs. In the tuning process, selecting unreasonable parameter ranges can increase time costs fivefold or, even worse, cause operational failures in the cluster and terminate system operation, an incalculable loss for large-scale clusters serving customers.
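To make the point about parameter ranges concrete, the sketch below declares a small bounded search space and rejects candidate configurations that fall outside it before they ever reach a cluster. The property names are standard Spark settings covering the four areas the abstract mentions, but every range, the `SEARCH_SPACE` structure, and the helper names `violations` and `to_spark_conf` are hypothetical choices for illustration, not tuning advice and not the method of the article.

```python
# Property names are real Spark settings; all bounds are illustrative
# assumptions and would need to be chosen per cluster.
SEARCH_SPACE = {
    "spark.executor.cores": ("int", 1, 8),
    "spark.executor.memory": ("int_gb", 1, 16),      # rendered as "<n>g"
    "spark.default.parallelism": ("int", 8, 512),
    "spark.shuffle.compress": ("choice", ["true", "false"]),
    "spark.io.compression.codec": ("choice", ["lz4", "snappy", "zstd"]),
}


def violations(config):
    """Return the names of settings that are unknown or out of range."""
    bad = []
    for name, value in config.items():
        spec = SEARCH_SPACE.get(name)
        if spec is None:
            bad.append(name)
        elif spec[0] == "choice":
            if value not in spec[1]:
                bad.append(name)
        elif not (spec[1] <= value <= spec[2]):
            bad.append(name)
    return bad


def to_spark_conf(config):
    """Render a validated candidate as the string values Spark expects."""
    out = {}
    for name, value in config.items():
        kind = SEARCH_SPACE[name][0]
        out[name] = f"{value}g" if kind == "int_gb" else str(value)
    return out


candidate = {
    "spark.executor.cores": 4,
    "spark.executor.memory": 64,  # exceeds the assumed 16 GB upper bound
    "spark.io.compression.codec": "zstd",
}
rejected = violations(candidate)
```

Here `rejected` names the memory setting, so the tuner can discard the candidate instead of launching a job that may fail on the cluster.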
zhangyuxin, xieyaoqin, sundeyu, gaoyuhua, cuiming, qinwenjian
Online: February 13, 2025 DOI: 10.12146/j.issn.2095-3135.20241129003
Abstract:Cervical cancer is one of the leading causes of cancer-related deaths among women globally, especially in developing countries, where its high incidence and late-stage diagnosis pose significant challenges to treatment. Accurate segmentation of critical organs, such as the colon and rectum, is crucial for the radiotherapy treatment of cervical cancer. Precise segmentation of these organs helps physicians with dose estimation, ensuring the accuracy and effectiveness of radiotherapy plans, and minimizing radiation damage to healthy tissues. However, the automatic segmentation of tubular structures, such as the colon and rectum, remains challenging, especially due to factors like bowel folds and motion artifacts, which lead to poor segmentation results. The complex morphology of the bowel and its low contrast with surrounding tissues make it difficult to identify boundaries, thereby affecting segmentation accuracy. This paper proposes a method for segmenting tubular critical organs in cervical cancer, combining centerline and distance map information to enhance the network's understanding of anatomical structures and improve the segmentation accuracy of tumors and critical organs. Based on the traditional U-Net architecture, we incorporate centerline and distance map learning into the network, which helps the model better recognize the topological structure of tubular critical organs and their spatial relationships within the body, ultimately improving segmentation precision and optimizing radiotherapy dose distribution. Through experimental evaluation on a cervical cancer dataset, performance analysis was conducted using metrics such as Dice Similarity Coefficient (Dice), Intersection over Union (IoU), Recall, and 95th percentile Hausdorff Distance (HD95).
The experimental results show that our method outperforms the traditional U-Net model in all these metrics, with a Dice score of 69.75%, IoU of 54.17%, and Recall of 73.57%, which represent improvements of 30.88%, 29.93%, and 36.91%, respectively, compared to the original 3D-UNet. HD95 is 9.46, which is a reduction of 8.59 compared to the original 3D-UNet. These results demonstrate that accurate segmentation of critical organs not only improves tumor recognition in cervical cancer but also provides important support for radiotherapy dose estimation, optimizes treatment planning, and enhances the safety and effectiveness of treatment.
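For readers unfamiliar with the overlap metrics reported above, the following sketch computes Dice, IoU, and Recall for a pair of binary masks using their standard definitions. The function name `overlap_metrics` and the convention of returning 1.0 when the denominator is zero are assumptions; HD95 is omitted because it requires surface distance computations, and the article's exact evaluation protocol is not given in the abstract.

```python
import numpy as np


def overlap_metrics(pred, target):
    """Dice, IoU and Recall for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = int(np.logical_and(pred, target).sum())    # true positives
    fp = int(np.logical_and(pred, ~target).sum())   # false positives
    fn = int(np.logical_and(~pred, target).sum())   # false negatives
    union = tp + fp + fn
    # Assumed convention: a zero denominator counts as a perfect score.
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    iou = tp / union if union else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return {"dice": dice, "iou": iou, "recall": recall}
```

Dice and IoU are monotonically related (Dice = 2 IoU / (1 + IoU)), which is why the two scores in the abstract move together, while Recall ignores false positives and can therefore be higher than both.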
JIANG Biao, ZHENG Jianglong, Huang Xiaoxin, Li Zhifeng, Li Linwei, HUANG Yifan
Online: February 13, 2025 DOI: 10.12146/j.issn.2095-3135.20241010001
Abstract:The electromagnetic pulse sound source (boomer) is a commonly used explosion sound source in marine seismic exploration, and deep-sea application of such a source requires solving the cavitation suppression problem. In this paper, a deep-sea boomer source based on pressure compensation balance is proposed. A boomer transducer with a maximum working pressure of 20 MPa is developed and tested in a high-pressure anechoic tank. Analysis of the hydrophone outputs at different energy and pressure levels shows that an air sac with an initial pressure of 0.5 MPa can effectively balance the internal and external pressure of the transducer, solve the cavitation suppression problem, and excite broadband pulse sound waves. The repeatability of the acoustic wave is very good, with a minimum correlation coefficient of 0.986. As the working pressure increases from 0.5 MPa to 20 MPa, the main changes in acoustic characteristics are amplitude attenuation (204.6 dB to 194.2 dB) and pulse width compression (182 μs to 88 μs), while the main frequency (centered at 2.3 kHz) shifts slightly toward higher frequencies. Comparing the hydrophone outputs recorded while the pressure in the high-pressure anechoic tank was rising and falling shows that the repeatability of the acoustic wave improves with pressure: the higher the pressure, the better the waveform consistency, indicating that the boomer transducer based on pressure compensation balance performs more stably in high-pressure environments.
Daiwei, Zhanghaoxuan, Chenfangxu, Pengwei
Online: February 13, 2025 DOI: 10.12146/j.issn.2095-3135.20241012001
Abstract:Cancer is a genetically related disease with multiple subtypes, each exhibiting significant differences in genetics, phenotype, and treatment response. Accurate classification of cancer subtypes is critical for personalized treatment, as it helps improve therapeutic outcomes. However, cancer subtype classification methods based on patient gene expression data often struggle to effectively distinguish rare subtypes in the presence of imbalanced samples. To address this issue, a cancer subtype classification method called MFP-VAE (Meta-learning Few-shot Prototype learning VAE) is proposed, focusing on handling datasets with imbalanced samples. This method improves the sampling strategy to ensure balanced consideration of different subtypes in meta-learning tasks. The model employs a variational autoencoder for feature extraction and classifies samples by calculating the distance between the samples and the subtype prototypes. Experimental results show that MFP-VAE outperforms existing methods on two public cancer datasets, significantly improving classification accuracy, especially under imbalanced sample conditions. Furthermore, survival analysis reveals that the distinguished cancer subtypes exhibit significant differences in clinical characteristics, providing meaningful clinical insights.
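The classification step described above, assigning a sample to the subtype whose prototype is closest, can be sketched in a few lines. This is a minimal illustration that assumes the embeddings have already been produced by some encoder (the abstract uses a variational autoencoder, which is not reproduced here), takes each prototype to be the mean embedding of its subtype, and uses Euclidean distance; the abstract does not state which distance MFP-VAE uses, and the function names are hypothetical.

```python
import numpy as np


def class_prototypes(embeddings, labels):
    """Return (classes, prototypes); each prototype is a class mean."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, protos


def classify_by_prototype(queries, classes, protos):
    """Assign every query embedding to its nearest prototype (Euclidean)."""
    queries = np.asarray(queries, dtype=float)
    # dists[i, k] = distance from query i to prototype k
    dists = np.linalg.norm(queries[:, None, :] - protos[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


# Toy support set: two subtypes with two embedded samples each.
support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
support_labels = [0, 0, 1, 1]
classes, protos = class_prototypes(support, support_labels)
predicted = classify_by_prototype([[1.0, 1.0], [9.0, 9.0]], classes, protos)
```

Because every subtype is summarized by a single mean vector, a rare subtype with only a few samples still gets a prototype of equal standing, which is one reason prototype-based classifiers are often paired with balanced episode sampling on imbalanced data.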
Li Yisheng, Xu Yongjie, Wang Shuqiang, Wang Yishan
Online: February 13, 2025 DOI: 10.12146/j.issn.2095-3135.20241127001
Abstract:With the rapid development of deep learning technology, autism screening based on neural signals such as Electroencephalography (EEG) is gradually emerging as a novel diagnostic approach. However, due to the complexity of EEG data acquisition, especially for children, insufficient data often poses a challenge. Data augmentation methods are commonly used to address the scarcity of real-world data, with Generative Adversarial Networks (GANs) being a frequently applied technique. However, due to the limited scale and diversity of data, current augmentation methods have yet to achieve optimal classification performance. This study introduces an improved conditional diffusion model to enhance both raw EEG signals and their corresponding functional connectivity temporal graphs. Experimental results demonstrate that this method significantly improves autism classification performance, achieving maximum classification accuracies of 84.38% and 79.01% for resting-state and task-state data, respectively. These findings validate the effectiveness of data augmentation based on the conditional diffusion model in enhancing autism screening outcomes.
Online: February 13, 2025 DOI: 10.12146/j.issn.2095-3135.20241201003
Abstract:Existing indoor three-dimensional (3D) object detection can detect only a limited number of object categories, which limits its application in intelligent robotics. Open-vocabulary object detection can detect all objects of interest in a given scene without predefined object categories, thus addressing this shortcoming of indoor 3D object detection. At the same time, large language models with prior knowledge can significantly improve the performance of visual tasks. However, existing research on open-vocabulary indoor 3D object detection focuses only on object information and ignores contextual information. The input data for indoor 3D object detection is mainly point clouds, which suffer from sparsity and noise. Relying only on the object point cloud can negatively affect the 3D detection results. Contextual information contains scene information, which can complement the object information to aid recognition of the object category. For this reason, this paper proposes an open-vocabulary 3D object detection algorithm assisted by contextual information. The algorithm integrates contextual information and object information through a large language model and then performs chain-of-thought reasoning. The proposed algorithm is validated on the SUN RGB-D and ScanNetV2 datasets, and the experimental results show its effectiveness.
ZHENG Jianglong, JIANG Biao, Li Zhifeng, Huang Xiaoxin, Li Linwei, HUANG Yifan
Online: February 13, 2025 DOI: 10.12146/j.issn.2095-3135.20241205001
Abstract:The high-pressure anechoic tank is an important experimental testing platform for the development of deep-sea transducers, sensors, and other acoustic instruments and equipment. In this paper, background noise and acoustic field fluctuations at different frequencies were measured for a homemade 20 MPa high-pressure anechoic tank. The echo interference level at a fixed measurement position and distance was calculated, and the echo interference curve was plotted. The time-frequency characteristics of signals under typical low-frequency and high-frequency conditions were analysed. The background noise measurements show that although the background noise inside the tank is relatively high and has characteristic peaks in the 10-12 kHz range, it still allows measurements with a sufficient signal-to-noise ratio. Meanwhile, the time-domain waveforms of sound field fluctuations measured at different frequencies show that the signal amplitude decays rapidly after the 2 ms transmission width, and the higher the frequency, the faster the attenuation, indicating that the sound absorption cone inside the tank absorbs sound effectively. The calculated echo interference levels show that most frequency points above 10 kHz do not exceed ±1 dB. The designed fixed measurement position meets the requirements of free-field testing; in particular, the echo interference at frequency points such as 20 kHz, 28 kHz, and 34 kHz does not exceed ±0.5 dB, which meets the requirements of precision measurement.
Liu Gaocheng, Tong Jiabo, Yang Shilin, Wang Qiuying, Tang Xinyu, Liu Chang, Liu Jia
Online: February 13, 2025 DOI: 10.12146/j.issn.2095-3135.20250118001
Abstract:The reconstruction of cerebral blood flow velocity (CBFV) is essential for the long-term assessment of cerebrovascular function. To this end, this study proposes a multivariate time-series model based on a Transformer encoder to reconstruct CBFV signals. The model utilizes time-series signals of arterial blood pressure (ABP) and CO2 to achieve accurate CBFV reconstruction. A long short-term memory (LSTM) module is introduced into the model to address the limitation of dispersed global attention in the attention mechanism, thereby enhancing the processing of local details. Additionally, a mixed loss function is employed to control local waveform errors, improving reconstruction accuracy. Furthermore, a transfer learning strategy is designed based on the correlation between ABP and electrocardiogram (ECG) signals to alleviate the impact of data scarcity on the reconstruction task. Experimental results on the cerebrovascular regulation dataset of diabetic patients from Beth Israel Deaconess Medical Center demonstrate that the proposed model outperforms existing regression models and deep learning models in CBFV reconstruction tasks. The results show a Pearson correlation coefficient of 0.518, a dynamic time warping distance of 17.879, and a mutual information value of 0.343 between the reconstructed and true values. Additionally, the model can reconstruct 200 data points within 0.04 seconds.
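Two of the reported similarity measures, the Pearson correlation coefficient and the dynamic time warping (DTW) distance, can be computed as sketched below. The DTW shown is the textbook dynamic-programming form with an absolute-difference local cost and no window or normalization; the abstract does not say which DTW variant or scaling the authors used, so values from this sketch are not directly comparable with the 17.879 reported. The function names are illustrative.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    return float(np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1])


def dtw_distance(a, b):
    """Classic DTW distance with |a_i - b_j| as the local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = best accumulated cost aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])
```

Pearson correlation is insensitive to amplitude scaling and offset, whereas DTW tolerates small timing shifts between waveforms, so the two measures capture complementary aspects of reconstruction quality.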
liujinqing, sunrenyun, zhaoling, zhangguohao, hezihao
Online: February 07, 2025 DOI: 10.12146/j.issn.2095-3135.20241204002
Abstract:Based on the path planning method that ensures tangency between arcs and straight lines, potential collision conditions during parking are analyzed. Moreover, parking path planning for oblique spaces is conducted, with obstacles and parking space boundaries evaluated in detail to ensure the reasonableness of the planning. Additionally, a collision constraint model for the parking path is constructed to account for the dynamic characteristics and geometric constraints of vehicle motion. Distance equations for critical collision conditions are developed to determine safe distances and effective areas during parking maneuvers. Simulation experiments conducted under various working conditions demonstrate that collision avoidance is effectively achieved through the path planning method, which aligns with the vehicle's traveling direction and accommodates varying lane widths. Ultimately, the validity of the planned paths is verified, and the safety and adaptability of these paths are assessed to ensure secure parki