Length-biased Sampling Research Papers - Academia.edu

2025

From autumn 2007 to spring 2008, fish were sampled at three stations in the estuarine system of the transboundary river Strymon using a bag seine net (2 mm mesh size, knot to knot). A total of 5066 specimens, weighing 4745.98 g, were caught. The specimens belonged to 19 species, 5 of which were caught only in Amphipoli lagoon. Atherina boyeri dominated the catches in terms of both number (40.91%) and weight (76%). Differences in the species composition of the catch were found between the two sampling periods.

2025, Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

While developing data-centric programs, users often run (portions of) their programs over real data, to see how they behave and what the output looks like. Doing so makes it easier to formulate, understand and compose programs correctly, compared with examination of program logic alone. For large input data sets, these experimental runs can be time-consuming and inefficient. Unfortunately, sampling the input data does not always work well, because selective operations such as filter and join can lead to empty results over sampled inputs, and unless certain indexes are present there is no way to generate biased samples efficiently. Consequently new methods are needed for generating example input data for data-centric programs. We focus on an important category of data-centric programs, dataflow programs, which are best illustrated by displaying the series of intermediate data tables that occur between each pair of operations. We introduce and study the problem of generating example intermediate data for dataflow programs, in a manner that illustrates the semantics of the operators while keeping the example data small. We identify two major obstacles that impede naive approaches, namely (1) highly selective operators and (2) noninvertible operators, and offer techniques for dealing with these obstacles. Our techniques perform well on real dataflow programs used at Yahoo! for web analytics.
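
The point about selective operators can be illustrated with a small simulation. The sketch below is not the paper's technique; it assumes two hypothetical tables that share the keys 0..n_rows-1 one-to-one, and shows that independently sampling 1% of each table leaves almost no matching keys for an equi-join.

```python
import random

def sampled_join_size(n_rows, sample_size, rng):
    """Join two independent uniform samples of tables that share keys 0..n_rows-1."""
    left = set(rng.sample(range(n_rows), sample_size))
    right = set(rng.sample(range(n_rows), sample_size))
    return len(left & right)  # rows that survive an equi-join on the key

def mean_sampled_join_size(n_rows=1000, sample_size=10, trials=200, seed=0):
    """Average join size over repeated trials (hypothetical sizes, for illustration)."""
    rng = random.Random(seed)
    total = sum(sampled_join_size(n_rows, sample_size, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    # The full join has n_rows rows; the expected size of the sampled join is
    # sample_size**2 / n_rows, i.e. 0.1 rows for the defaults.
    print(mean_sampled_join_size())
```

With these assumed sizes most sampled runs produce an empty join, which is the failure mode the abstract describes for naive input sampling.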

2025

The moorland, or paramo, is a threatened ecosystem. The indiscriminate advance of the agricultural frontier is causing the loss of ecosystem services, especially the water service. This research estimated the willingness to pay (WTP) of water users in the Municipality of Riobamba for the conservation of the water service in the Micro-basin of the Chimborazo River (MCRCH). Four hundred and six surveys were conducted using the double limit dichotomous contingent valuation method, and the responses were fitted with a maximum likelihood model in Stata. Four models were developed: simple limit, simple limit with other explanatory variables, double limit, and double limit with other explanatory variables, the last being statistically more significant. The estimated WTP to conserve the water service of the MCRCH is USD 0.84 per month, a value that increases by USD 0.04 when the home ownership variable is included. The climate change variable increases the WTP by USD 0.24, while the level of education decreases it by USD 0.04.

2025

In this article, the median test for complete data is generalized to multiple Type-II censored data for the two-sample location problem. Since the score-generating function associated with the generalized median (GM) test statistic has a finite number of jump discontinuities, a modified version of the theorem is used to establish its asymptotic normality under fixed location alternatives. The effect of censoring on the asymptotic relative efficiency (ARE) is studied. It is found that, as long as the median of the data is available, any particular case of multiple Type-II censoring has no effect on the ARE of the GM test. There is a gain in efficiency due to middle censoring when the underlying distribution is normal, logistic or extreme value. This suggests that in these cases it is possible to improve the efficiency of the GM test by discarding suitable portions of the data.

2025, Advances in Complex Systems

We introduce an automatic configuration mechanism that generates the most relevant information to be presented to limited attention users of information-rich media. It also guarantees to maximize their total expected utility from the information they receive. A computationally efficient algorithm is used to assign an index value to each information item, which then determines whether or not a given item appears in the top list presented to users at a given time.

2024, Journal of Physics: Conference Series

A new breakthrough in jet propulsion technology since the invention of the jet engine has been achieved. The first critical tests for future air-breathing magneto-plasma propulsion systems have been successfully completed. It is also the first time that a pinching dense plasma focus discharge has been ignited at one atmosphere and driven in pulse mode using very fast, nanosecond electrostatic excitations to induce self-organized plasma channels for ignition of the propulsive main discharge. Depending on the capacitor voltage (200-600 V), the energy input at one atmosphere varies from 52 to 320 J/pulse, corresponding to impulse bits from 1.2 to 8.0 mNs. Such a pulsed plasma propulsion system, driven at one thousand pulses per second, would already match the thrust-to-area ratios (50-150 kN/m²) of modern jet engines. An array of thrusters could enable future aircraft and airships to take off from the ground and reach altitudes of 50 km and beyond. The high power needed could be provided by future compact plasma fusion reactors already in development by aerospace companies. The magneto-plasma compressor itself was originally developed by Russian scientists as a plasma fusion device and was later miniaturized for supersonic flow control applications. The first breakthrough is thus based on a spin-off of plasma fusion technology.
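
As a rough consistency check on the quoted figures (a worked estimate, not a result stated in the abstract), the time-averaged thrust of a pulsed thruster is the impulse bit times the pulse rate, and the average input power is the energy per pulse times the pulse rate. Assuming every pulse delivers the quoted impulse bit $I_{\text{bit}}$ and energy $E_{\text{pulse}}$ at $f = 1000$ pulses per second:

```latex
\bar{T} = I_{\text{bit}}\, f
        = (1.2\text{--}8.0)\times 10^{-3}\,\mathrm{N\,s} \times 10^{3}\,\mathrm{s^{-1}}
        = 1.2\text{--}8.0\ \mathrm{N},
\qquad
\bar{P} = E_{\text{pulse}}\, f
        = (52\text{--}320)\,\mathrm{J} \times 10^{3}\,\mathrm{s^{-1}}
        = 52\text{--}320\ \mathrm{kW}.
```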

2024, Applied Economics Letters

We propose a new estimation technique to deal with missing response variables in the context of a nested multinomial logit model. Survey data often have a significant number of incomplete or missing responses. If such data are systematically missing (i.e., not missing at random) and the affected observations are deleted from the analysis, sample selection bias results. We apply our new method to the empirical analysis of job loss status.

2024

Uncertainty pervades most aspects of life. From selecting a new technology to choosing a career, decision makers often do not know the outcomes of their decisions. In the last decade a new paradigm has emerged in behavioral decision research in which decisions are "experienced" rather than "described", as in standard decision theory. The dominant finding from studies using the experience-based paradigm is that decisions from experience exhibit a "black swan effect", i.e. the tendency to neglect rare events. Under prospect theory, this results in an experience-description gap. We show that several tentative conclusions can be drawn from our interdisciplinary examination of the putative experience-description gap in decision under uncertainty. First, while the major source of under-weighting of rare events may be sampling error, it is argued that a robust experience-description gap remains when these factors are not at play. Second, the residual experience-description gap is not only about experience per se, but also about the way in which information concerning the probability distribution over possible outcomes is learned. Additional econometric and empirical work might be required to fully flesh out these tentative conclusions. However, there was a consensus that an initially polemical literature has turned out to be constructive in drawing researchers towards greater rapprochement.

2024, The Annals of Statistics

The multiplicative censoring model introduced in Vardi [Biometrika 76 (1989) 751-761] is an incomplete data problem whereby two independent samples from the lifetime distribution $G$, $X_m = (X_1, \ldots, X_m)$ and $Z_n = (Z_1, \ldots, Z_n)$, are observed subject to a form of coarsening. Specifically, sample $X_m$ is fully observed while $Y_n = (Y_1, \ldots, Y_n)$ is observed instead of $Z_n$, where $Y_i = U_i Z_i$ and $(U_1, \ldots, U_n)$ is an independent sample from the standard uniform distribution. Vardi [Biometrika 76 (1989) 751-761] showed that this model unifies several important statistical problems, such as the deconvolution of an exponential random variable, estimation under a decreasing density constraint and an estimation problem in renewal processes. In this paper, we establish the large-sample properties of kernel density estimators under the multiplicative censoring model. We first construct a strong approximation for the process $\sqrt{k}(\hat{G} - G)$, where $\hat{G}$ is a solution of the nonparametric score equation based on $(X_m, Y_n)$, and $k = m + n$ is the total sample size. Using this strong approximation and a result on the global modulus of continuity, we establish conditions for the strong uniform consistency of kernel density estimators. We also make use of this strong approximation to study the weak convergence and integrated squared error properties of these estimators. We conclude by extending our results to the setting of length-biased sampling.
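
The data-generating mechanism described above can be simulated in a few lines. This is only a sketch of the observation scheme, not of the paper's estimators; the Exponential(1) lifetime distribution and the function names are assumptions chosen for illustration.

```python
import random

def simulate_multiplicative_censoring(m, n, draw_lifetime, seed=0):
    """Return (x, y): x is fully observed, y_i = u_i * z_i with u_i ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    x = [draw_lifetime(rng) for _ in range(m)]
    z = [draw_lifetime(rng) for _ in range(n)]   # never observed directly
    y = [rng.random() * z_i for z_i in z]        # coarsened observations
    return x, y

def exponential_lifetime(rng):
    # Assumed lifetime distribution G for illustration: Exponential(1), mean 1.
    return rng.expovariate(1.0)

if __name__ == "__main__":
    x, y = simulate_multiplicative_censoring(1000, 100_000, exponential_lifetime)
    print(sum(x) / len(x), sum(y) / len(y))
```

Because the uniform multiplier is independent of the lifetime and has mean 1/2, the average of the coarsened sample should sit near half the mean of G, which is a quick sanity check on the simulation.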

2024, Pakistan Journal of Statistics

In this paper, we consider a nonparametric estimator of the Lorenz curve when the data exhibit some kind of dependence. The uniform strong convergence rate of the estimator is obtained under a strong mixing hypothesis. A strong Gaussian approximation for the associated Lorenz process is established under appropriate assumptions. A law of the iterated logarithm for the Lorenz process is also derived.
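
For readers unfamiliar with the object being estimated, the following is a minimal sketch of the usual plug-in (empirical) Lorenz curve. It is assumed here for illustration and may differ in detail from the estimator studied in the paper; dependence in the data changes the asymptotic theory, not this computation.

```python
import math

def empirical_lorenz(values, p):
    """Share of the total held by the smallest floor(n * p) observations."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    ordered = sorted(values)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    k = math.floor(len(ordered) * p)   # number of order statistics included
    return sum(ordered[:k]) / total

if __name__ == "__main__":
    # The poorer half of the hypothetical incomes [1, 1, 1, 7] holds 2 of 10 units.
    print(empirical_lorenz([1, 1, 1, 7], 0.5))
```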

2024

We present a hybrid path planning algorithm for rigid bodies translating and rotating in a 3D workspace. Our approach generates a Voronoi roadmap in the workspace and combines it with "bridges" computed by a randomized path planner with Voronoi-biased sampling. The Voronoi roadmap is computed from a discrete approximation to the generalized Voronoi diagram (GVD) of the workspace, which is generated using graphics hardware. By this use of the GVD, portions of the path can be generated without random sampling, substantially reducing the number of random samples needed for the full query. The planner has been implemented and tested on a number of benchmarks. Some preliminary comparisons with a randomized motion planner indicate that our planner performs more than an order of magnitude faster in several challenging scenarios.

2024, The Astrophysical Journal

Angular and spatial correlations are measured for K-band-selected galaxies, 248 having redshifts, 54 with $z > 1$, in two patches of combined area $\simeq 27$ arcmin$^2$. The angular correlation for $K \le 21.5$ mag is $\omega(\theta) \simeq (\theta / 1.4 \pm 0.19''\, e^{\pm 0.1})^{-0.8}$. From the redshift sample we find that the real-space correlation, calculated with $q_0 = 0.1$, of $M_K \le -23.5$ mag galaxies (k-corrected) is $\xi(r) = (r / 2.9 e^{\pm 0.12}\, h^{-1}\,\mathrm{Mpc})^{-1.8}$ at a mean $z \simeq 0.34$, $(r / 2.0 e^{\pm 0.15}\, h^{-1}\,\mathrm{Mpc})^{-1.8}$ at $z \simeq 0.62$, $(r / 1.4 e^{\pm 0.15}\, h^{-1}\,\mathrm{Mpc})^{-1.8}$ at $z \simeq 0.97$, and $(r / 1.0 e^{\pm 0.2}\, h^{-1}\,\mathrm{Mpc})^{-1.8}$ at $z \simeq 1.39$, the last being a formal upper limit for a blue-biased sample. In general, these are more correlated than optically selected samples in the same redshift ranges. Over the interval $0.3 \le z \le 0.9$, galaxies with red rest-frame colors, $(U - K)_0 > 2$ AB mag, have $\xi(r) \simeq (r / 2.4 e^{\pm 0.14}\, h^{-1}\,\mathrm{Mpc})^{-1.8}$, whereas bluer galaxies, which have a mean $B$ of 23.7 mag and mean [O II] equivalent width $W_{\mathrm{eq}} = 41$ Å, are very weakly correlated, with $\xi(r) \simeq (r / 0.9 e^{\pm 0.22}\, h^{-1}\,\mathrm{Mpc})^{-1.8}$. For our measured growth rate of clustering, this blue population, if non-merging, can grow only into a low-redshift population less luminous than $0.4 L_*$. The cross-correlation of low- and high-luminosity galaxies at $z \simeq 0.6$ appears to have an excess in the correlation amplitude within $100\, h^{-1}$ kpc. The slow redshift evolution is consistent with these galaxies tracing the mass clustering in a low-density, $\Omega \simeq 0.2$, relatively unbiased, $\sigma_8 \simeq 0.8$, universe, but cannot yet exclude other possibilities.

2024, European Conference on Artificial Intelligence

Many real-world applications of AI require both probability and first-order logic to deal with uncertainty and structural complexity. Logical AI has focused mainly on handling complexity, and statistical AI on handling uncertainty. Markov Logic Networks (MLNs) are a powerful representation that combines Markov Networks (MNs) and first-order logic by attaching weights to first-order formulas and viewing these as templates

2024, Thailand Statistician

This paper considers stratified inverse sampling with four variations within each stratum, namely inverse random sampling with replacement, inverse random sampling without replacement, inverse probability proportional to size (PPS) sampling with replacement and inverse PPS sampling without replacement. Unbiased estimators of the mean of a study variable in the whole population and of the number of units in a class of interest are given, together with their unbiased variance estimators. Estimation of the mean per unit in the class of interest is also presented. A simulation study is used to examine the properties of these sampling designs, and the results indicate that inverse sampling without replacement is more efficient than inverse sampling with replacement. Inverse PPS sampling gives more efficient estimates than inverse random sampling when the correlation coefficient between the auxiliary and study variables is large. When the number of sampled units in a class of interest increases, the variance and mean squared error of the estimate decrease.
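
To make the idea of inverse sampling concrete, here is a minimal sketch of the simplest unstratified case: units are drawn with replacement until k members of the class of interest have been found. It uses the classic estimator (k - 1) / (n - 1) of the class proportion, which is unbiased for k >= 2; this is not one of the paper's stratified estimators, and the parameter values are hypothetical.

```python
import random

def inverse_sample_size(p, k, rng):
    """Draw units with replacement until k units from the class of interest are found."""
    draws = hits = 0
    while hits < k:
        draws += 1
        if rng.random() < p:   # this unit belongs to the class of interest
            hits += 1
    return draws

def proportion_estimate(k, n):
    """Classic unbiased estimator (k - 1) / (n - 1) of the class proportion, for k >= 2."""
    if k < 2:
        raise ValueError("k must be at least 2")
    return (k - 1) / (n - 1)

def mean_estimate(p=0.2, k=5, trials=20_000, seed=0):
    """Average the estimator over repeated inverse samples (hypothetical p and k)."""
    rng = random.Random(seed)
    total = sum(proportion_estimate(k, inverse_sample_size(p, k, rng))
                for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    print(mean_estimate())  # should land close to the assumed p = 0.2
```

The naive ratio k / n is biased upward under this stopping rule, which is why the corrected form is used in the sketch.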

2024, Stat

The pervasive use of prevalent cohort studies on disease duration increasingly calls for an appropriate methodology to account for the biases that invariably accompany samples formed by such data. It is well known, for example, that subjects with shorter lifetime are less likely to be present in such studies. Moreover, certain covariate values could be preferentially selected into the sample, being linked to the long-term survivors. The existing methodology for estimating the propensity score using data collected on prevalent cases requires the correct conditional survival/hazard function given the treatment and covariates. This requirement can be alleviated if the disease under study has stationary incidence, the so-called stationarity assumption. We propose a non-parametric adjustment technique based on a weighted estimating equation for estimating the propensity score, which does not require modeling the conditional survival/hazard function when the stationarity assumption holds. The estimator's large-sample properties are established, and its small-sample behavior is studied via simulation. The estimated propensity score is utilized to estimate the survival curves.

2024

This paper investigates the application of an inductive logic programming system, allied with Markov Logic Networks (MLNs), to the task of learning event models from large video datasets. A learning-from-interpretations setting is used to learn event models efficiently; these models define the structure of an MLN. The network parameters are obtained from discriminative learning, and probabilistic inference is used to query the MLN for event recognition.

2024

Recent evolutions in distributed computing have significantly increased the degree of uncertainty inherent to any distributed system and led to a scale shift that traditional approaches can no longer accommodate. The key to scalability in this context lies in fully decentralized and self-organizing solutions. The objective of the ASAP project team is to provide a set of abstractions and algorithms to build serverless large-scale distributed applications involving a large set of volatile, geographically distant, potentially mobile and/or resource-limited computing entities.

2024, Journal of Vertebrate Paleontology

In this article, hypotheses about the origin, evolution and dispersal of Megantereon are reviewed using the fossil specimens included in previous comparative studies as well as the remains identified in the late Pliocene site of Fonelas (Spain) and the early Pleistocene localities of Lantian, Lingyi, Longdan, Renzidong (China), and Untermassfeld (Germany). The validity of the two species proposed by Martínez-Navarro and Palmqvist (1995), Megantereon cultridens and M. whitei, is evaluated using tooth measurements and multivariate statistical methods. The hypothesis of sexual dimorphism as an explanation for the morphological variability of Megantereon is tested with a large sample of sexed individuals of Panthera pardus and Panthera leo. Results obtained indicate similar or even smaller differences in tooth dimensions between M. cultridens and M. whitei than between sexes in both leopards and lions, except in the case of the lower fourth premolar. However, in spite of a substantial overlap between both Megantereon species in the size of the upper canine, this tooth reverses the differences found for other tooth measurements, because M. cultridens shows larger cheek teeth on average than M. whitei but smaller sabers. This is confirmed by principal components and discriminant analyses, which reveal that sexual dimorphism in leopards and lions is a matter of tooth size and not of relative proportions and argues against the interpretation of M. cultridens and M. whitei as the sexes (males and females, respectively) of a single species. These results indicate that M. cultridens and M. whitei are valid species, because the differences in tooth measurements exceed those expected from sexual dimorphism and do not reveal the effects of biased sampling. Finally, an analysis of jaw anatomy reveals biomechanical differences between both Megantereon species, related to the relative efficiency of the biting muscles at the level of the lower carnassial.

2024, Electronic Colloquium on Computational Complexity

Nisan and B. Velickovic. Approximations of general independent distributions. In Proc. of 24th ACM Symposium on Theory of Computing, pp. 10-16, 1992. [KK94] D. Karger and D. Koller. (De)randomized construction of small sample spaces in NC. In Proc. of 35th IEEE Symposium on Foundations of Computer Science, pp. 252-263, 1994. [KM94] H. Karloff and Y. Mansour. On construction of k-wise independent random variables. In Proc. of the 26th Annual ACM Symposium on Theory of Computing, 1994. [KSTT92] J. Köbler, U. Schöning, S. Toda, and J. Torán. Turing machines with few accepting computations and low sets for PP. Journal of Computer and System Sciences 44(2): 272-286, 1992. [KW84] R. Karp and A. Wigderson. A fast parallel algorithm for the maximal independent set problem. In Proc. of the 16th Annual ACM Symposium on Theory of Computing, 1984. [LN86] R. Lidl and H. Niederreiter. Introduction to Finite Fields and Their Applications. Cambridge University Press, 1986. [Lub85] M. Luby. A simple parallel algorithm for the maximal independent set problem.

2024

A sample of death notices from the New Zealand Herald was used as the basis of a Data Analysis assignment. This note explores some interesting statistical aspects of these death notices, using common data analysis techniques, and illustrates how they can be used as a resource for teaching. In particular they provide a clear example of biased sampling, a concept that is usually hard to quantify.

2024, Bernoulli

Consider informative selection of a sample from a finite population. Responses are realized as independent and identically distributed (i.i.d.) random variables with a probability density function (p.d.f.) f, referred to as the superpopulation model. The selection is informative in the sense that the sample responses, given that they were selected, are not i.i.d. f. In general, the informative selection mechanism may induce dependence among the selected observations. The impact of such dependence on the empirical cumulative distribution function (c.d.f.) is studied. An asymptotic framework and weak conditions on the informative selection mechanism are developed under which the (unweighted) empirical c.d.f. converges uniformly, in L2 and almost surely, to a weighted version of the superpopulation c.d.f. This yields an analogue of the Glivenko-Cantelli theorem. A series of examples, motivated by real problems in surveys and other observational studies, shows that the conditions are verifiable for specified designs.

2024, Annals of Statistics

Right censored survival data collected on a cohort of prevalent cases with constant incidence are length-biased, and may be used to estimate the length-biased (i.e., prevalent-case) survival function. When the incidence rate is constant, the so-called stationarity of the incidence, it is more efficient to use this structure for unconditional statistical inference than to carry out an analysis by conditioning on the observed truncation times. It is well known that, due to the informative censoring for prevalent cohort data, the Kaplan-Meier estimator is not the unconditional NPMLE of the length-biased survival function, and the asymptotic properties of the NPMLE do not follow from any known result. We present here a detailed derivation of the asymptotic properties of the NPMLE of the length-biased survival function from right censored prevalent cohort survival data with follow-up. In particular, we show that the NPMLE is uniformly strongly consistent, converges weakly to a Gaussian process, and is asymptotically efficient. One important spin-off from these results is that they yield the asymptotic properties of the NPMLE of the incident-case survival function [see Asgharian, M'Lan and Wolfson, J. Amer. Statist. Assoc. 97 (2002) 201-209], which is often of prime interest in a prevalent cohort study. Our results generalize those given by Vardi and Zhang [Ann. Statist. 20 (1992) 1022-1039] under multiplicative censoring, which we show arises as a degenerate case in a prevalent cohort setting.
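
The effect of length bias itself can be seen in a small uncensored simulation. The sketch below is far simpler than the NPMLE discussed above and ignores censoring entirely: it assumes a hypothetical population of durations, samples units with probability proportional to duration, and compares the naive sample mean with the harmonic-mean correction n / sum(1 / x_i), which targets the population mean under pure length-biased sampling.

```python
import random

def length_biased_sample(population, n, rng):
    """Draw n units with probability proportional to their length (with replacement)."""
    return rng.choices(population, weights=population, k=n)

def harmonic_mean_estimate(sample):
    """Estimate the population mean from a length-biased sample: n / sum(1 / x_i)."""
    return len(sample) / sum(1.0 / x for x in sample)

def demo(seed=0):
    rng = random.Random(seed)
    # Hypothetical durations: Gamma(shape=3, scale=1), so the mean is about 3.
    population = [rng.gammavariate(3.0, 1.0) for _ in range(20_000)]
    sample = length_biased_sample(population, 50_000, rng)
    true_mean = sum(population) / len(population)
    naive_mean = sum(sample) / len(sample)   # biased upward by the sampling scheme
    return true_mean, naive_mean, harmonic_mean_estimate(sample)

if __name__ == "__main__":
    print(demo())
```

In this setup the naive mean overshoots because long durations are over-represented, while the harmonic-mean estimate stays close to the population mean.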

2024

Wavelet linear density estimation for associated stratified size-biased sample

2024, Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management

Exploring the effect of personalization on different queries can improve the ranking result. There is a need for a mechanism to estimate the potential for personalization of queries. Previous methods for estimating this potential, such as click entropy and topic entropy, are based on the documents previously clicked for a query or on the query history. Their limitation is that prior click data are unavailable for new or unseen queries and for queries without history. To alleviate this problem, we provide a solution that works regardless of query history. In this paper, we present a new metric that uses the topic distribution of user documents in the topical user profile to estimate the potential for personalization for all queries. Using the proposed metric, we achieve better performance for queries with history and solve the cold start problem for queries without history. To improve personalized search, we provide a personalization ranking model that combines personalized and non-personalized topic models, where the proposed metric is used to estimate personalization. The results show that the personalization ranking model using the proposed metric improves the Mean Reciprocal Rank and the Normalized Discounted Cumulative Gain by 5% and 4% respectively.
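
For context, click entropy, one of the earlier measures mentioned above, can be sketched as follows. This is the measure as it is commonly defined (the entropy of the clicked-document distribution for a query), not the new metric proposed in the paper; the function name and the input format are assumptions.

```python
import math
from collections import Counter

def click_entropy(clicked_docs):
    """Entropy, in bits, of the distribution of clicked documents for one query."""
    counts = Counter(clicked_docs)
    total = sum(counts.values())
    if total == 0:
        # A new or unseen query has no click history, so the measure is undefined.
        raise ValueError("no click history for this query")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

if __name__ == "__main__":
    print(click_entropy(["d1", "d1", "d1", "d1"]))   # every user clicks the same result
    print(click_entropy(["d1", "d2", "d3", "d4"]))   # clicks spread over four results
```

The error raised for an empty history is the cold start limitation the abstract refers to: a measure built on past clicks has nothing to work with for a query that has never been issued.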

2024, Goldschmidt Abstracts

2024, Proceedings of the National Academy of Sciences

An investigation of the statistical properties of the native conformations of proteins, observed from crystal structures, is reported. Protein conformations were analyzed in terms of a bond vector correlation function and molecular volume. It was observed that, while the volume of a protein structure varies nearly linearly with the number of residues, the bond vector correlation function exhibits a universal feature for all sizes of proteins. To interpret the nature of the bond vector correlation function of native protein structures quantitatively, Monte Carlo simulations of realistic polypeptide chains of specific but arbitrary amino acid sequence were carried out. The molecule was constrained in an ellipsoidal volume determined by its chain length, and conformations with unacceptable nonbonded contacts between different amino acid residues were excluded. The interactions within a terminally blocked single residue, which correlate two nearest-neighbor peptide groups in a chain, we...

2024, Scientific Annals of the Danube Delta Institute

The study of the ichthyofauna of the Rosu-Puiu lake-complex was undertaken in May 2005, using two complementary sampling methods: electric fishing for the shoreline and gillnet fishing for the deep-water zone. The paper presents the current state of the fish communities of the Rosu-Puiu lake-complex, part of the "marine delta" of the Danube Delta. In this lake-complex, 29 fish species were caught (5 exotic and 24 native). The most abundant species are Alburnus alburnus (bleak), Abramis bjoerkna (silver bream), Clupeonella cultriventris (Black Sea sprat), Carassius gibelio (Prussian carp) and Rutilus rutilus (roach), with slight differences between the two sampling methods. The biodiversity and equitability indices were calculated per lake and per method. The lake-complex has a stable ecosystem, with the shoreline (Shannon-Wiener index 1.155, equitability index 0.437) less stable than open water (Shannon-Wiener index 1.878, equitabili...
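
The Shannon-Wiener index and equitability reported in this abstract can be computed from per-species catch counts. The sketch below is a minimal version. It assumes the natural logarithm and Pielou's evenness (H' divided by ln S), neither of which is stated in the abstract. The function names and the input format are illustrative.

```python
import math


def shannon_wiener(counts):
    """Shannon-Wiener diversity H' = -sum(p_i * ln p_i) over species with non-zero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def pielou_evenness(counts):
    """Equitability J = H' / ln(S), where S is the number of species present."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        # Evenness is undefined for a single species; return 0.0 by convention here.
        return 0.0
    return shannon_wiener(counts) / math.log(richness)
```

For example, `pielou_evenness([120, 45, 30, 5])` takes hypothetical counts for four species. The value approaches 1 when individuals are spread evenly across species and falls toward 0 when one species dominates.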

2024, Astronomy & Astrophysics

With FORS 1 at the VLT we have tried for the first time to measure the magnetic field variation over the pulsation cycle in six roAp stars to begin the study of how the magnetic field and pulsation interact. For the star HD 101065, which has one of the highest photometric pulsation amplitudes of any roAp star, we found a signal at the known photometric pulsation frequency at the 3σ level in one data set; however this could not be confirmed by later observations. A preliminary simple calculation of the expected magnetic variations over the pulsation cycle suggests that they are of the same order as our current noise levels, leading us to expect that further observations with increased S/N have a good chance of achieving an unequivocal detection.

2024, Astronomy & Astrophysics

Magnetic fields play a key role in the pulsations of rapidly oscillating Ap (roAp) stars since they are a necessary ingredient of all pulsation excitation mechanisms proposed so far. This implies that the proper understanding of the seismological behaviour of the roAp stars requires knowledge of their magnetic fields. However, the magnetic fields of the roAp stars are not well studied. Here we present new results of measurements of the mean longitudinal field of 14 roAp stars obtained from low resolution spectropolarimetry with FORS 1 at the VLT.

2024, Astronomy and Astrophysics Supplement Series

We present new measurements of the mean magnetic field modulus of a sample of Ap stars with spectral lines resolved into magnetically split components. We report the discovery of 16 new stars having this property. This brings the total number of such stars known to 42. We have performed more than 750 measurements of the mean field modulus of 40 of these 42 stars, between May 1988 and August 1995. The best of them have an estimated accuracy of 25-30 G. The availability of such a large number of measurements allows us to discuss for the first time the distribution of the field modulus intensities. A most intriguing result is the apparent existence of a sharp cutoff at the low end of this distribution, since no star with a field modulus (averaged over the rotation period) smaller than 2.8 kG has been found in this study. For more than one third of the studied stars, enough field determinations well distributed throughout the stellar rotation cycle have been achieved to allow us to characterize at least to some extent the variations of the field modulus. These variations are often significantly anharmonic, and it is not unusual for their extrema not to coincide in

2024

Let $E = (P_\theta)$ be a dominated experiment with an open Euclidean parameter space and densities $(f_\theta)$. The experiment is said to be continuously $L_2$-differentiable if the family $(\sqrt{f_\theta})$ is continuously $L_2$-differentiable. The following assertions are proved: (1) The experiment $E$ is continuously $L_2$-differentiable iff the family $(f_\theta)$ is continuously $L_1$-differentiable and Fisher's information function is continuous. (2) Suppose that the experiment $E$ is continuously $L_2$-differentiable and the experiment $F$ is less informative than $E$. Then $F$ is continuously $L_2$-differentiable, too.

2024, South African Journal of Economics

This paper examines the impact of financial deepening on long-run economic growth in South Africa over the period 1954-92. Two models are developed using the Johansen VECM structure. The first model investigates whether the financial system has a direct or indirect effect on per capita output via the investment rate. The second model attempts to investigate the possibility of feedback effects between the financial and real sectors. We find that both dimensions of the financial system (financial intermediation and securities) affect economic growth in both models. Furthermore, both models reveal that the financial system has an indirect effect on GDP via the investment rate. Feedback effects are also found to exist between the real and financial sectors. One interpretation of the evidence is that credit rationing is prevalent in South Africa, with firms relying extensively on internal finance to meet their financing requirements.

2024, arXiv (Cornell University)

Collaborative tagging describes the process by which many users add metadata in the form of keywords to shared content. Recently, collaborative tagging has grown in popularity on the web, on sites that allow users to tag bookmarks, photographs and other content. In this paper we analyze the structure of collaborative tagging systems as well as their dynamical aspects. Specifically, we discovered regularities in user activity, tag frequencies, kinds of tags used, bursts of popularity in bookmarking and a remarkable stability in the relative proportions of tags within a given url. We also present a dynamical model of collaborative tagging that predicts these stable patterns and relates them to imitation and shared knowledge.

2024

This paper will report on the first full investigations on the level of occurrence and qualitative/quantitative profiles of microplastics (MP, 1-5 mm) in a number of sandy beaches in Malta (Central Mediterranean). Five popular beaches were investigated: Ghadira Bay, Golden Bay, St. George's Bay, Ghajn Tuffieha Bay and Pretty Bay. Samples for all bays were collected in August 2015, while further detailed sampling was carried out for the last two bays in summer and in winter of 2016. The sampling protocol was adopted from Galgani et al. (2013). For all locations, samples were collected from the strandline and then at 10 m up-shore at the surface (top 5 cm). For Ghajn Tuffieha and Pretty Bay, samples were also collected from a 40 cm depth. MP were extracted from sand through wet sieving, and then sorted and characterized according to size, colour, shape, and polymer type. Several parameters, including degree of sea exposure and sand properties, were recorded. Full beach profiles for all locations are available. Identification of polymer type was carried out by means of qualitative density tests. In summer of 2015, the highest levels of MP were reported in Pretty Bay at 10.81 items/1000 cm³ of wet sand, with the lowest being in Ghajn Tuffieha at 0.72 items/1000 cm³. In general, levels of MP in the dry season were found to be higher than those recorded in the wet season (winter). Higher MP concentration was recorded at 10 m up-shore as opposed to the strandline. Furthermore, surface sands contained a higher concentration of MP when compared with the subsurface sediments, though this was not the case at Pretty Bay in winter. These results are interpreted in terms of different beach profiles, beach dynamics, sand properties and potential sources of MP. The local level of occurrence of MP seems to be lower when compared to other European locations studied so far. The fact that in this study MP below 1 mm were not included in the data, as well as the lack of rivers in the Maltese islands, regular beach clean-ups and other factors, may explain this. Data on the characterisation of MP found are provided. For example, polyethylene and polypropylene were the most common polymers recorded at Ghajn Tuffieha Bay, whereas polyethylene and paint fragments were the most common MP recorded at Pretty Bay. This investigation is a contribution to our knowledge of how levels of MP in sandy beaches may be affected by sand properties and dynamics, beach profiles and other factors.

2024

No data from comprehensive, multi-seasonal macrofaunal studies for the two secluded beaches at ix-Xatt l-Ahmar (Gozo) have been published to date. This paper reports on the deployment of pitfall trap constellations (for nocturnal surface-active macrofauna) and hand-towed nets in shallow water (for infralittoral macrofauna emerging at night in the water column) for eight consecutive seasons. Despite their small size, the beaches at ix-Xatt l-Ahmar harbour relatively high macrofaunal individual abundances, high fractions of psammophilic (i.e. sand-specific) species and high fractions of rare or endangered species. On the basis of the results reported in this study, it is suggested that the current conservation regime afforded to the ix-Xatt l-Ahmar environs (mainly on cultural and historical grounds) be revised and extended in order to safeguard this ecologically important site.

2024, Croatian Journal of Fisheries

2024, Physical Review B

With the experimental results obtained with a special nanopillar structure and the calculations based on the one-dimensional diffusion equations, the respective contributions to the spin torque from spin accumulation and local spin current are quantitatively deduced. The results for a typical nanopillar structure show that the spin accumulation contributes about 90% of the necessary torque for the antiparallel to parallel switching, while the parallel to antiparallel switching is totally dominated by the local spin current. The variations in both spin accumulation and local spin current should be considered in any effort to reduce the critical switching current density. However, it seems more effective to increase the local spin current in most cases.

2024, Biology Letters

Sexual size dimorphism (SSD) is a common morphological trait in ungulates, with polygyny considered the leading driver of larger male body mass and weapon size. However, not all polygynous species exhibit SSD, while molecular evidence has revealed a more complex relationship between paternity and mating system than originally predicted. SSD is, therefore, likely to be shaped by a range of social, ecological and physiological factors. We present the first definitive analysis of SSD in the common hippopotamus (Hippopotamus amphibius) using a unique morphological dataset collected from 2994 aged individuals. The results confirm that hippos exhibit SSD, but the mean body mass differed by only 5% between the sexes, which is rather limited compared with many other polygynous ungulates. However, jaw and canine mass are significantly greater in males than females (44% and 81% heavier, respectively), highlighting the considerable selection pressure for acquiring larger weapons. A predominantly aquatic lifestyle coupled with the physiological limitations of their foregut fermenting morphology likely restricts body size differences between the sexes. Indeed, hippos appear to be a rare example among ungulates whereby sexual selection favours increased weapon size over body mass, underlining the important role that species-specific ecology and physiology have in shaping SSD.

2024, Lecture Notes in Computer Science

Processes involving change over time, uncertainty, and rich relational structure are common in the real world, but no general algorithms exist for learning models of them. In this paper we show how Markov logic networks (MLNs), a recently developed approach to combining logic and probability, can be applied to time-changing domains. We then show how existing algorithms for parameter and structure learning in MLNs can be extended to this setting. We apply this approach in two domains: modeling the spread of research topics in scientific communities, and modeling faults in factory assembly processes. Our experiments show that it greatly outperforms purely logical (ILP) and purely probabilistic (DBN) learners.

2024, Journal of the American Academy of Psychiatry and the Law

Popular media and the lay public have long expressed concerns about the association between violent video games and violent behavior. The current scientific literature exploring this connection focuses primarily on the relationship between violent video games and aggression in healthy populations. We are unaware of prior publications exploring the effect of such games on aggression in institutional settings or with forensic populations. Here we examine whether state psychiatric institutions, particularly forensic hospitals, have set policies to govern the use of violent video games for patients under their care. We present data from a national survey of such institutions in the United States, with some anecdotal international data included. The results demonstrate that hospital policies, when they exist, are inconsistent in their approaches to the use of violent video games. We argue that hospitals should devise policies that acknowledge the limited evidence in this area and that optimally balance the relevant stakeholders' interests. We propose guiding principles that balance these competing interests for institutions to consider when developing such policies. Finally, we advocate for further research regarding the safety and potential therapeutic effects of video games in forensic settings so that an evidence-based approach can be initiated in the future.

2024, Journal of Research in Personality

2024, Aquatic Sciences and Engineering

In this study, a total of 1283 samples of five fish species belonging to two families, Cyprinidae and Leuciscidae, were collected from the Lower Sakarya River between June 2017 and May 2018 in order to determine some growth parameters. The samples were collected monthly with trammel nets, fyke nets, and an electroshocker. The age of the fish was determined from the scales. The von Bertalanffy growth model was calculated as $L_t = 92.18\,(1 - e^{-0.054(t+0.040)})$ for A. brama, $L_t = 69.40\,(1 - e^{-0.040(t+0.030)})$ for B. bjoerkna, $L_t = 51.09\,(1 - e^{-0.114(t+0.024)})$ for C. gibelio, $L_t = 48.11\,(1 - e^{-0.088(t+0.023)})$ for R. rutilus and $L_t = 41.74\,(1 - e^{-0.104(t+0.035)})$ for V. vimba. The phi-prime growth performance index (Φ') value was computed as 2.628, 2.268, 2.474, 2.307 and 2.260 for A. brama, B. bjoerkna, C. gibelio, R. rutilus and V. vimba, respectively. This study provides basic information on some growth parameters of five fish species living in the Lower Sakarya River. The results of this study are useful for fishery management and stock assessment in the Sakarya River.
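
The growth equations in this abstract have the standard von Bertalanffy form $L_t = L_\infty (1 - e^{-K(t - t_0)})$, so each can be evaluated directly. The sketch below assumes that form. For the phi-prime index it assumes the widely used definition Φ' = log10(K) + 2·log10(L∞), which the abstract does not state, so computed values may differ slightly from the reported ones. The function and parameter names are illustrative, and the abstract does not give the length units.

```python
import math


def vbgf_length(t, l_inf, k, t0):
    """Predicted length at age t under the von Bertalanffy growth function."""
    return l_inf * (1.0 - math.exp(-k * (t - t0)))


def phi_prime(l_inf, k):
    """Growth performance index, assuming phi' = log10(K) + 2 * log10(L_inf)."""
    return math.log10(k) + 2.0 * math.log10(l_inf)
```

For example, `vbgf_length(5, 51.09, 0.114, -0.024)` evaluates the C. gibelio curve at age 5. The term `(t+0.024)` in the abstract corresponds to `t0 = -0.024`.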

2024

in large-scale distributed systems with a heterogeneous peer distribution. Abstract: Large-scale distributed systems gather thousands of nodes spread across the world. These systems must offer good routing performance regardless of their size and despite a high connection/disconnection rate. To achieve this, the system must add shortcuts to its logical graph (overlay). However, to build efficient shortcuts, peers need information about the overlay topology. With heterogeneous peer distributions, retrieving this information is not straightforward. Moreover, because of the high connection/disconnection rate, the topology evolves quickly, which soon makes the collected information obsolete. State-of-the-art systems either avoid the problem by forcing peers to adopt a uniform distribution, or only partially satisfy the routing performance requirements. To address this problem, we propose DONUT, a mechanism for building a local map that approximates the peer distribution. This map makes it possible to estimate locally, and accurately, the graph distance to other peers. An evaluation performed with a matrix of real latencies and connection/disconnection traces shows that our map increases routing efficiency by at least 20% compared with state-of-the-art techniques. It also shows that each map is small and can be propagated efficiently through the network, consuming less than 10 bps on each peer.

2024, Statistics & Probability Letters

The Hamburger and Stieltjes moment problems and the sequence of maximum entropy (MaxEnt) approximates with given moments are considered. The MaxEnt approximate converges in entropy to the unknown distribution when the Hamburger (or Stieltjes) moment problem is determinate and the underlying distribution has finite entropy.

2024, Mammalian Biology

Dietary investigations of sympatric felids are a means of understanding how closely related species deal with food resources in a potentially competitive scenario. The diets of the oncilla Leopardus tigrinus, the jaguarundi Puma yagouaroundi and the ocelot Leopardus pardalis were studied through the analysis of scats in Araucaria Pine Forest with Natural Grasslands of southern Brazil. Small mammals comprised the bulk of the diets of the three felids, followed by birds and reptiles. The smallest food-niche overlap index was 0.84, indicating that these felids shared an important portion of their food resources. Inter-species differences were detected in the consumption of the most frequent rodent prey; L. tigrinus was the only species that consumed all the most frequent rodent prey differently from the rate expected from their abundances. Although these findings suggest competitive interactions, with the oncilla being the most subordinate species, further experimental investigations are necessary to elucidate more precisely how these syntopic felids coexist. The effects of sample size and its influences on the evaluation of the diets of the felids, especially of the ocelot, are discussed. We compare our data to a previous study in the same area, to account for the possible influences of biased sampling and uneven distribution of food resources on the diet of the ocelot. The opportunistic feeding behavior and the abundance of their primary prey (cricetid rodents) seem to allow these small cats to be resilient despite severe anthropogenic disturbance in the study area. We further suggest guidelines for future studies in the study region, in order to understand the dynamics of mammalian carnivore demography.

2024, Springer Proceedings in Physics

We introduce an adaptive Monte Carlo method to calculate free energy differences for continuous systems. The method uses a biasing potential based on integrating the derivative of the Hamiltonian with respect to parameters of interest. Tests on the two-dimensional Lennard-Jones fluid are used to demonstrate the efficiency of the method.