Outside of the major technological shifts and digitisation of data in recent years, secondary data analysis has also become more widely used due to the challenges of undertaking empirical research. An issue some nursing scientists face is recruiting populations of patients or carers that are difficult to reach for a myriad of social, cultural, economic and political reasons. These may include refugee and migrant groups, those who experience domestic and sexual violence, homelessness and many more (Biederman & Forlan, 2016). Research fatigue in over-researched groups, which may include some cancer patients, certain indigenous communities, nursing students and others, can also be avoided by adopting secondary analysis (Clark, 2008). Hence, utilising existing datasets related to participants of interest can offer an alternative way to examine some issues while removing respondent burden (Ziebland & Hunt, 2014). Szabo & Strang (1997) suggest it can also help reduce researcher bias and provide some objectivity, as the researcher may not have been immersed in the original study design or data collection. Equally, accessing different professional groups, from clinicians to policy makers, may prove difficult at times. For instance, emerging global health crises such as COVID-19 pose barriers to recruiting these types of participants and carrying out primary data collection, outside of research focused on addressing the immediate health crisis (Nicol et al., 2020). Therefore, tapping into existing datasets and interpreting them to address research questions can be beneficial, enabling an area of nursing science to move forward.
In terms of the secondary analysis of quantitative data, traditional approaches utilising descriptive and inferential statistics on an array of datasets are common. Secondary sources of quantitative data may include national censuses conducted by governments, local or regional datasets held by public bodies, or questionnaires and surveys undertaken by researchers at a university or other type of national or international institution (Dale et al., 2008). For example, Oh et al. (2016) mined the Korea Youth Risk Behaviour Web-based Survey to determine whether satisfaction with sleep was linked to stress in adolescents with atopic disease, while Jacoby et al. (2017) reused data from a longitudinal cohort study of psychological outcomes from minor injury to examine how this relates to recovery and disability. Digital archives held by libraries, museums or other social and cultural agencies could also be useful sources of quantitative data, with some providing an extensive catalogue that is searchable online. The International Federation of Data Organisations (IFDO, 2020) and the Consortium of European Social Science Data Archives (CESSDA, 2020) may be helpful in identifying national archives for secondary analysis. However, Dale et al. (2008) warn of potential problems with the secondary analysis of quantitative data, as surveys and other measurement tools may have been constructed, and their reliability and validity determined, in specific ways. Equally, the sample of participants, their characteristics and response rates may pose issues when modelling for correlation or causation. Hence, a critical eye should be cast to appreciate the strengths, limitations and biases inherent in a quantitative dataset before reusing it.
Both the CutPrimers-based pipeline and the Cutadapt-based pipeline have similar analysis steps (Fig 2), but they differ in where the adapter removal step occurs and in whether the researcher knows the primer sequences. In the Cutadapt-based pipeline, the forward and reverse reads are run separately through Cutadapt and DADA2, since the forward and reverse V-specific primers generated unique sequence amplicons. In the CutPrimers-based pipeline, all V regions and runs were processed separately. However, since the Metagenomics PP plugin generates split, V-region-specific reads that combine forward and reverse reads, the combined V region subsets were run through DADA2 together. Nevertheless, the principle of running each sequencing run separately through DADA2 still applies, so that sequencing error can be modeled independently for each run [11]. In both pipelines, the separate feature tables and sequences are merged after DADA2 to generate one combined feature table for taxonomy assignment and downstream analyses. The Cutadapt-based pipeline can also be used as a guide for single-amplicon data from Ion Torrent reads, as the main requirement for this pipeline is that the primer sequences are known. Similarly, these microbiome analysis pipelines are also relevant for Illumina sequences, but researchers with sequences generated on Illumina platforms should also consult the freely available tutorials in the QIIME2 user guide aimed specifically at Illumina or paired-end sequences.
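The merge step after DADA2 can be pictured with a small sketch. This is an illustration only, not the pipeline's actual implementation: the function name `merge_feature_tables`, the nested-dictionary table layout, and the ASV and sample labels are all assumptions made for the example. It takes one feature table per sequencing run and sums counts for features that appear in more than one table.

```python
from collections import defaultdict


def merge_feature_tables(tables):
    """Combine per-run feature tables into one table.

    Each table is assumed (for this sketch) to map a feature ID to a
    dict of {sample ID: read count}. Counts are summed when the same
    feature and sample appear in more than one table.
    """
    merged = defaultdict(lambda: defaultdict(int))
    for table in tables:
        for feature, counts in table.items():
            for sample, count in counts.items():
                merged[feature][sample] += count
    # Convert back to plain dicts so the result is easy to inspect.
    return {feature: dict(counts) for feature, counts in merged.items()}


# Hypothetical tables from two sequencing runs, each denoised separately.
run1 = {"ASV_A": {"s1": 10, "s2": 0}, "ASV_B": {"s1": 3, "s2": 7}}
run2 = {"ASV_A": {"s3": 5}, "ASV_C": {"s3": 2}}

merged = merge_feature_tables([run1, run2])
```

Keeping the runs separate until this point is what allows the error model to be fitted per run; only the resulting tables are combined.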
The variance in individual taxa accuracy metrics demonstrated that the agreement between bacterial taxonomic classification and the expected mock relative abundance (RA) varied by V region and occasionally by reference database. Actinomyces and Clostridium were largely underrepresented in the even mock samples, except that Clostridium showed a more accurate RA representation in the V2-V4 regions when the Greengenes reference database was used. Conversely, Bacteroides, for example, had RA values in the feature table that were larger than expected in regions V2-V6-7, but had mean accuracy values close to zero in regions V8-9 across all reference databases in both the even and staggered mocks. As specific bacterial taxa are more predominant in different microbiome habitats, such as the oral and gut microbiome [35], researchers should consider the microbiome site sampled, in addition to other factors, if one individual V region is emphasized for downstream analysis. The accuracy measure (O/E ratio) used in this work reports how close the observed genus relative abundance value was to the expected value; however, we are unable to report how well these ratios compare to other literature, since there is a scarcity of benchmarking literature targeting the multi-amplicon kits currently available. Importantly, future benchmarking work using multi-amplicon kits and Ion Torrent sequences is needed to confirm our results. Furthermore, this variability in taxonomic annotation across V regions supports our view that it is imperative to plan the sequencing strategy in line with the research questions, the dominant taxa in the microbiome being targeted, and the available resources [4].
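As a rough illustration of an observed/expected (O/E) ratio, the sketch below divides each genus's observed relative abundance by its expected value in the mock community. It assumes the ratio is a plain quotient, which the text does not spell out; the function name `oe_ratio` and all abundance values are invented for the example and are not results from this work.

```python
def oe_ratio(observed, expected):
    """Return the observed/expected relative abundance ratio per genus.

    A ratio near 1 means the observed value matches the expected one,
    values above 1 indicate overrepresentation, and values near 0
    indicate that the genus was barely detected. Genera missing from
    `observed` are treated as having an abundance of zero.
    """
    ratios = {}
    for genus, exp in expected.items():
        if exp <= 0:
            raise ValueError(f"expected abundance for {genus} must be positive")
        ratios[genus] = observed.get(genus, 0.0) / exp
    return ratios


# Made-up values for an even mock community (not data from the study).
expected = {"Bacteroides": 0.05, "Clostridium": 0.05, "Actinomyces": 0.05}
observed = {"Bacteroides": 0.10, "Clostridium": 0.01}

ratios = oe_ratio(observed, expected)
```

In this toy input Bacteroides is overrepresented, Clostridium is underrepresented, and Actinomyces is absent from the observed table.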
When V-region informed planning is not possible due to lack of previous data on the taxa of bacteria expected or a need for taxonomic resolution at the level of species or strain, full length 16S rRNA gene sequencing with long-read sequencing technology or shotgun metagenomics sequencing may be desirable [10]. However, when this is not feasible, using the benchmarking-associated resources presented in this manuscript to target two or more V regions in a bias-informed manner would be a valuable alternative in order to merge taxonomy tables and manage this variation in annotation efficiency.
Although we believe this manuscript and the pre-processing pipelines presented make a valuable contribution to the literature, there are some limitations in this research that we want to address. First, we present two pre-processing pipelines aimed at amplicon deconvolution and ASV generation, up to the point of feature or taxonomy table generation. Feature table combination strategies with multi-amplicon kits are outside the scope of this manuscript, but many strategies have been employed to synthesize additive information from multiple feature tables in downstream microbiome analysis [20,36]. Additionally, the benchmarking results presented in this manuscript are only comparable to reads generated from the Ion 16S kit and Ion Torrent sequencing data. This work focused on a finite number of mock samples and, as such, discrepancies in taxa names were accounted for manually. For future work employing a larger number of samples or taxa, we suggest using NCBI taxonomy IDs to merge taxa across many tables. Nevertheless, we believe this workflow and pre-processing pipeline will be useful for analyzing data in the QIIME2 environment and will allow for flexibility in pre-processing and downstream analysis of microbiome sequencing data.
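The suggestion to merge taxa by taxonomy ID rather than by name might look something like the following sketch. Everything here is hypothetical: the function `collapse_by_taxid`, the name-to-ID lookup, and the integer IDs (placeholders, not real NCBI taxonomy IDs) are assumptions used only to show the idea that differently spelled names can resolve to a single key.

```python
def collapse_by_taxid(table, name_to_taxid):
    """Re-key a {taxon name: count} table by taxonomy ID.

    Names that resolve to the same ID are summed. Names with no entry
    in the lookup are returned separately so they can be checked by
    hand instead of being silently dropped.
    """
    collapsed = {}
    unmapped = []
    for name, count in table.items():
        taxid = name_to_taxid.get(name)
        if taxid is None:
            unmapped.append(name)
            continue
        collapsed[taxid] = collapsed.get(taxid, 0) + count
    return collapsed, unmapped


# Placeholder IDs for illustration only; a real lookup would come from
# the NCBI taxonomy database.
name_to_taxid = {"g__Clostridium": 1, "Clostridium": 1, "Bacteroides": 2}

# The same genus labelled differently, as can happen across databases.
table = {"g__Clostridium": 4, "Clostridium": 6, "Bacteroides": 3, "Unknown_X": 1}

collapsed, unmapped = collapse_by_taxid(table, name_to_taxid)
```

Returning the unmapped names keeps the manual check described above available for the cases the lookup cannot resolve.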
This open resources textbook contains 10 Units that describe and explain the main concepts in statistical analysis of psychological data. In addition to conceptual descriptions and explanations of the basic analyses for descriptive statistics, this textbook also explains how to conduct those analyses with common statistical software (Excel) and open-source free software (R).
People with a passion for science and improving the health of others are well-suited for a role in clinical research. You'll need to be a person who is good at observation as well as someone who is analytical since the field requires plenty of data analysis. Critical thinking and decision-making skills are a must, as are good communication skills, both written and verbal. You must be willing to provide constructive feedback to subjects and colleagues, as well as motivate your subjects. You'll also need to document and record findings. For this reason, computer skills are important as well. Those who work in the clinical research field should be organized, good administrators, multi-taskers, and people who can think quickly on their feet.
In addition, APA Ethical Principles specify that "after research results are published, psychologists do not withhold the data on which their conclusions are based from other competent professionals who seek to verify the substantive claims through reanalysis and who intend to use such data only for that purpose, provided that the confidentiality of the participants can be protected and unless legal rights concerning proprietary data preclude their release" (Standard 8.14).