Automatic Music Transcription
Bibliography (1970-1998)
The bibliography accumulated here might still be of interest to those working in the field of Automatic Music Transcription.
Keywords: automatic music transcription, wav to midi conversion, fundamental frequency detection,
pitch tracking, sound segmentation, note onset detection, beat induction,
rhythm recognition, content-based audio retrieval, score following, expressive
performance extraction, music perception.
Abbreviations:
CMJ: Computer Music Journal
IEEE ASSP: IEEE Transactions on Acoustics, Speech and Signal Processing
JAES: Journal of the Audio Engineering Society
JASA: Journal of the Acoustical Society of America
JNMR: Journal of New Music Research
MP: Music Perception
ICASSP: International Conference on Acoustics, Speech and Signal Processing
ICMC: International Computer Music Conference
ICMPC: International Conference on Music Perception and Cognition
IJCAI: International Joint Conference on Artificial Intelligence
ACM: Association for Computing Machinery
CCRMA: Center for Computer Research in Music and Acoustics
IRCAM: Institut de Recherche et Coordination Acoustique/Musique
MIT: Massachusetts Institute of Technology
Agon, C., G. Assayag,
J. Fineberg and C. Rueda (1994). Kant: a critique of pure quantification,
ICMC’94, pp. 52-59.
Allen, P.E. and R.B. Dannenberg
(1990). Tracking musical beats in real time, ICMC’90, pp. 140-143.
Andre-Obrecht, R. (1988).
A new statistical approach for the automatic segmentation of continuous speech
signals, IEEE ASSP 36(1).
Askenfelt, A. (1976).
Automatic notation of played music (status report), STL-QPSR 1/1976, pp. 1-11.
Askenfelt, A. and K.
Elenius (1977). Editor and search programs for music, STL-QPSR, 4/1977, pp.
9-12.
Askenfelt, A. (1979).
Automatic notation of played music: the VISA project, Fontes Artes Musicae,
Vol. XXVI/2, pp. 109-120.
Avitsur, E. (1993).
WATER: A workstation for automatic transcription of ethnic recordings, Computing
in Musicology 9, p. 77.
Bagshaw, P.C., S.M.
Hiller and M.A. Jack (1993). Enhanced pitch tracking and the processing of
f0 contours for computer aided intonation teaching, EuroSpeech’93, pp. 1003-1006.
Bilmes, J. (1993).
Timing is of the essence: perceptual and computational techniques for representing,
learning, and reproducing expressive timing in percussive rhythm, M.Sc. thesis,
MIT Media Laboratory.
Blackburn, S. and D.
DeRoure (1998). A tool for content based navigation of music, ACM Multimedia’98
- Electronic Proceedings.
Bobrek, M. (1996). Polyphonic
music segmentation using wavelet based pre-structured filter banks with improved
time-frequency resolution, Ph.D. Dissertation, 1996.
Bobrek, M. and D.B.
Koch (1997). A Macintosh based system for polyphonic music transcription,
ESEAM’97, The First Electronic Scientific and Engineering Applications of
the Macintosh Conference, 1997.
Bregman, A.S. (1990).
Auditory Scene Analysis: the Perceptual Organisation of Sound, MIT Press.
Brown, G.J. and M.
Cooke (1994). Perceptual grouping of musical sounds: a computational model,
JNMR 23(1), pp. 107-132.
Brown, J.C. and M. Puckette
(1989). Calculation of a “narrowed” autocorrelation function, JASA 85(4),
pp. 1595-1601.
Brown, J.C. (1991).
Calculation of a Constant Q Spectral Transform, JASA 89, pp. 425-434.
Brown, J.C. and M.S.
Puckette (1993). A high resolution fundamental frequency determination based
on phase changes of the Fourier transform, JASA 94(2), pp. 662-667.
Brown, J.C. (1993).
Determination of the meter of musical scores by autocorrelation, JASA 94(4),
pp. 1953-1957.
Brown, J.C. and K.V.
Vaughn (1996). Pitch center of stringed instrument vibrato tones, JASA 100,
pp. 1728-1735.
Cagle, R.T. (1996).
Music to MIDI: progress towards the automatic transcription of multi-timbral
musical signals into standard MIDI files, Ph.D. Dissertation, University of
Tennessee, 1996.
Cagle, R.T. and D.B.
Koch (1997). Progress towards the automatic transcription of musical recordings
into standard MIDI files, Electronic Scientific and Engineering Applications
of the Macintosh Conference eSEAM’97.
Calway, A. (1989).
The multiresolution Fourier transform: a general purpose tool for image analysis,
PhD Thesis, Department of Computer Science, The University of Warwick, UK.
Cambouropoulos, E. (1998).
Musical parallelism and melodic segmentation, Proceedings of the XIIth Colloquium
on Musical Informatics, pp. 111-114.
Carey, M., E.S. Parris
and G.D. Tattersall (1997). Pitch Estimation of Singing for Re-Synthesis and
Musical Transcription, EuroSpeech’97, pp. 887-890.
Carreras, F., M. Leman
and D. Petrolino (1998). Extraction of music harmonic information using schema-based
decomposition, Proceedings of the XIIth Colloquium on Musical Informatics,
pp. 115-118.
Casajus-Quiros, F.J.
and P. Fernandez-Cid (1994). Real-time loose-harmonic matching fundamental
frequency estimation for musical signals, ICASSP’94, pp. 221-224.
Cerveau, L. (1994).
Segmentation de phrases musicales à partir de la fréquence fondamentale, Mémoire
DEA ATIAM, Université Paris 6.
Chafe, C., B. Mont-Reynaud
and L. Rush (1982) Toward an intelligent editor of digital audio: Recognition
of musical constructs, CMJ, 6(1), pp. 30-41.
Chafe, C., D. Jaffe,
K. Kashima, B. Mont-Reynaud and J. Smith (1985). Techniques for note identification
in polyphonic music, ICMC’85, pp. 399-405.
Chafe, C. and D. Jaffe (1986).
Source separation and note identification in polyphonic music, ICASSP’86.
Chilton, E.H.S. and
B.G. Evans (1987). Performance comparison of five pitch determination algorithms
on the linear prediction residual of speech, EuroSpeech’87, pp. 403-406.
Chowning, J.M., L. Rush,
B. Mont-Reynaud, C. Chafe, A. Schloss and J. Smith (1984). Intelligent systems
for the analysis of digitized acoustic signals, Final report, Technical Report
STAN-M-15, Stanford University Department of Music.
Chowning, J.M. and
B. Mont-Reynaud (1986). Intelligent analysis of composite acoustic signals,
Technical Report STAN-M-36, Stanford University Department of Music.
Clynes, M. (1987). What
can a musician learn about music performance from newly discovered microstructure
principles (PM and PAS)?, In A. Gabrielson (ed.) Action and Perception in
Rhythm and Music, Royal Swedish Academy of Music, 55.
Cook, P.R. (1995). An
investigation of singer pitch deviation as a function of pitch and dynamics,
Thirteenth International Congress of Phonetic Sciences, pp. 202-205.
Coüasnon, B. and B.
Rétif (1995). Using a grammar for a reliable full score recognition system,
ICMC’95, pp. 187-194.
Coyle, E.J. and I. Shmulevich
(1998). A system for machine recognition of music patterns, ICASSP’98, pp.
3597-3600.
d’Alessandro, C. and
M. Castellengo (1991). Etudes, par la synthèse de la perception du vibrato
vocal dans les transitions de notes, Bulletin d’Audiophonologie 7, pp. 551-564.
d’Alessandro, C. and
M. Castellengo (1994). The pitch of short duration vibrato tones, JASA 95(3),
pp. 1617-1630.
Dannenberg, R.B. and
B. Mont-Reynaud (1987). Following an improvisation in real time, ICMC’87,
pp. 241-248.
Desain, P. and H. Honing
(1989). Quantization of musical time: A connectionist approach, CMJ 13(3).
Desain, P. and H. Honing
(1993). Time functions function better as functions of multiple times, CMJ
16(2), pp. 17-34.
Desain, P. (1993).
A connectionist and a traditional AI quantizer, symbolic versus sub-symbolic
models of rhythm perception, Contemporary Music Review 9, pp. 239-254 (http://www.nici.kun.nl/mmm/publications/list.html).
Desain, P. and H. Honing
(1994). Foot-tapping: a brief introduction to beat induction, ICMC’94, pp.
78-79 (http://www.nici.kun.nl/mmm/publications/list.html).
Desain, P. and H. Honing
(1994). Rule-based models of initial beat induction and an analysis of their
behavior, ICMC’94, pp. 80-82 (http://www.nici.kun.nl/mmm/publications/list.html).
Desain, P. and H. Honing
(1994). Does expressive timing in music performance scale proportionally with
tempo?, Psychological Research 56, pp. 285-292 (http://www.nici.kun.nl/mmm/publications/list.html).
Desain, P. (1995). A
(de)composable theory of rhythm perception, MP 9, pp. 439-454.
Desain, P. and H. Honing
(1995). Towards algorithmic descriptions of continuous modulations of musical
parameters, ICMC’95, pp. 393-395 (http://www.nici.kun.nl/mmm/publications/list.html).
Desain, P. and H. Honing
(1996). Modeling continuous aspects of music performance: vibrato and portamento,
ICMPC’96 (http://www.nici.kun.nl/mmm/publications/list.html).
Desain, P. et al. (1997).
Robust score performance matching: taking advantage of structural information,
ICMC’97, pp. 337-340 (http://www.nici.kun.nl/mmm/publications/list.html).
Di Federico, R. and
G. Borin (1998). An improved pitch synchronous sinusoidal analysis-synthesis
method for voice and quasi-harmonic sounds, Proceedings of the XIIth Colloquium
on Musical Informatics, pp. 215-218.
Dixon, S.E. and D.M.W.
Powers (1996). The characterization, separation and transcription of complex
acoustic signals, Proceedings of the 6th Australian International Conference
on Speech Science and Technology, pp. 73-78.
Dixon, S. (1996). A
dynamic modeling approach to music recognition, ICMC’96, pp. 83-86.
Dixon, S. (1996). Multiphonic
note identification, Proceedings of the 19th Australasian Computer Science
Conference (? Australian Computer Science Communications 18(1)), pp. 318-323.
Dixon, S. (1997). Beat
induction and rhythm recognition, Proceedings of the Australian Joint Conference
on Artificial Intelligence, pp. 311-320.
Dolson, M. (1986).
The phase vocoder, CMJ 10(4), pp. 14-27.
Doval, B. (1994). Estimation
de la Fréquence Fondamentale des Signaux Sonores, Thèse de doctorat de l’Université
Paris VI.
Drake, C. and C. Palmer
(1993). Accent structures in music performance, MP 10(3), pp. 343-378.
Drioli, C. and G. Borin
(1998). Automatic recognition of musical events and attributes in singing,
Proceedings of the XIIth Colloquium on Musical Informatics, pp. 17-20.
Ellis, D.P.W. (1996).
Prediction-driven computational auditory scene analysis, Ph.D. Thesis, MIT.
(http://sound.media.mit.edu/papers.html#dpwe).
Fernandez-Cid, P. and
F.J. Casajus-Quiros (1998). Multi-pitch estimation for polyphonic musical
signals, ICASSP’98, pp. 3565-3568.
Foote, J.T. (1997).
Content based retrieval of music and audio, Proceedings of SPIE, Vol 3229,
pp. 138-147, (http://www.fxpal.com/people/foote/papers/index.htm).
Foote, J.T. (1997).
An overview of audio information retrieval, ACM - Springer, Multimedia Systems,
(http://www.fxpal.com/people/foote/papers/index.htm).
Forsberg, J. (1997).
Automatic conversion of sound to the MIDI-format, M.Sc. thesis, Department
of Speech, Music and Hearing, Royal Institute of Technology, Stockholm.
Forsberg, J. (1998). Automatic conversion of sound to the MIDI-format, TMH-QPSR, 1-2/1998,
pp. 53-60.
Foster, S. (1982). A
pitch synchronous segmenter for musical signals, ICASSP’82.
Foster, S. and A.J.
Rockmore (1982). Signal processing for the analysis of musical sound, ICASSP’82,
pp. 89-92.
Foster, S., W.A. Schloss
and A.J. Rockmore (1982). Toward an intelligent editor of digital audio: Signal
processing methods, CMJ, 6(1), pp. 42-51.
Ghias, A., J. Logan,
D. Chamberlin, B.C. Smith (1995). Query by humming - Musical information retrieval
in an audio database, ACM Multimedia’95 - Electronic Proceedings, (http://www.cs.cornell.edu/Info/Faculty/bsmith/query-by-humming.html).
Gold, B. and L. Rabiner
(1969). Parallel processing techniques for estimating pitch periods of speech
in the time domain, JASA, 46(2), pp. 442-448.
Goldstein, J.L., A.
Gerson, P. Srulovicz and M. Furst (1978). Verification of the optimal probabilistic
basis of aural processing in pitch of complex tones, JASA 63(2), pp. 486-497.
Gordon, J.W. (1987).
The perceptual attack transients in musical tones, JASA 82(1), pp. 88-105.
Gordon, J.W. (). Perception
of attack transients in musical tones, Technical Report STAN-M-17, Department
of Music, Stanford University.
Goto, M. and Y. Muraoka
(1995). A real-time beat tracking system for audio signals, ICMC’95, pp. 171-174.
Goto, M. and Y. Muraoka
(1996). Beat tracking based on multiple-agent architecture - a real-time beat
tracking system for audio signals, Proceedings of the 2nd International Conference
on Multiagent Systems, pp. 103-110, (http://staff.aist.go.jp/m.goto/publications.html).
Goto, M. and Y. Muraoka
(1997). Issues in evaluating beat tracking systems, IJCAI’97 Workshop on Issues
in Artificial Intelligence and Music - Evaluation and Assessment, pp. 9-16,
(http://staff.aist.go.jp/m.goto/publications.html)
Goto, M. and Y. Muraoka
(1997). Real-time rhythm tracking for drumless audio signals - chord change
detection for musical decisions, IJCAI’97 Workshop on CASA, pp. 135-144, (http://staff.aist.go.jp/m.goto/publications.html).
Goto, M. and Y. Muraoka
(1998). An audio-based real-time beat tracking system and its applications,
ICMC’98, pp. 17-20, (http://staff.aist.go.jp/m.goto/publications.html).
Goto, M. and Y. Muraoka
(1998). Music understanding at the beat level - real-time beat tracking for
audio signals, In Readings in CASA (eds. Rosenthal, D. and H. Okuno), Erlbaum,
Mahwah, NJ, pp. 157-176.
Grassi, M. (1998).
Mistuned scales, Proceedings of the XIIth Colloquium on Musical Informatics,
pp. 228-231.
Grubb, L. and R. Dannenberg
(1994). Automating ensemble performance, ICMC’94, pp. 63-69.
Handel, S. (1989).
Listening: An Introduction to the perception of Auditory Events. MIT Press,
Cambridge, Massachusetts.
Hashimoto, S., H. Qi
and D. Chang (1996). Sound database retrieved by sound, ICMC’96, pp. 121-123.
Hawley (1993). Structure
out of Sound, Ph.D. thesis, MIT.
Hermes, D. (1988).
Measurement of pitch by subharmonic summation, JASA 83(1), pp. 257-264.
Hess, W. (1983). Pitch
Determination of Speech Signals. Springer-Verlag, New York.
Honing, H. (1995). The
vibrato problem, comparing two solutions, CMJ 19(3).
Inoue, W., S. Hashimoto
and S. Ohteru (1993). A computer music system for human singing, ICMC’93,
pp. 150-153.
Iwamiya, S., T. Miyakura
and N. Satoh (1989). Perceived pitch of complex FM-AM tones, ICMPC’89, pp.
431-436.
Kageyama, T., K. Mochizuki
and Y. Takashima (1993). Melody retrieval with humming, ICMC’93, pp. 349-351.
Kapadia, J.H. (1995).
Automatic recognition of musical notes, M.Sc. thesis, University of Toledo.
Kapadia, J.H. and J.F.
Hemdal (1995). Automatic recognition of musical notes, JASA 98(5), p. 2957.
Kashino, K. and H.
Tanaka (1993). A sound source separation system with the ability of automatic
tone modeling, ICMC’93, pp. 248-255.
Kashino, K., K. Nakadai,
T. Kinoshita, H. Tanaka (1995). Application of Bayesian probability network
to music scene analysis, Working notes of the IJCAI’95 Computational Audio
Scene Analysis workshop.
Kashino, K., K. Nakadai,
T. Kinoshita, H. Tanaka (1995). Organization of Hierarchical Perceptual Sounds,
IJCAI’95, pp. 158-164.
Kashino, K. and H. Murase
(1998). Music Recognition using note transition context, ICASSP’98, pp. 3593-3596.
Katayose, H. and S.
Inokuchi (1989). The Kansei music system, CMJ 13(4), pp. 72-77.
Katayose, H., T. Kanamori,
K. Kamei, Y. Nagashima, K. Sato, S. Inokuchi and S. Simura (1993). Virtual
performer, ICMC’93, pp. 138-145.
Katayose, H. and S.
Inokuchi (1993). Learning performance rules in a music interpretation system,
Computers and Humanities 27(1), pp. 31-40.
Katayose, H. and S.
Inokuchi (1995). A model of pattern processing for music, ICMC’95, pp. 505-506.
Keislar, D., T. Blum,
J. Wheaton and E. Wold (1995). Audio analysis for content-based retrieval,
ICMC’95, pp. 199-202.
King, J.-B. and Y.
Horii (1993). Vocal matching of frequency modulation in synthesised vowels,
Journal of Voice 7, pp. 151-159.
Klapuri, A. (1997).
Automatic Transcription of Music, M.Sc. Thesis, Department of Information
Technology, Tampere University of Technology, Finland. (http://www.cs.tut.fi/~klap/iiro/contents.html).
Klapuri, A. (1998).
Number theoretical means of resolving a mixture of several harmonic sounds,
Proceedings of the European Signal Processing Conference EUSIPCO’98. http://www.cs.tut.fi/~klap/iiro/).
Klapuri, A. (1999).
Sound onset detection by applying psychoacoustic knowledge, ICASSP’99, (http://www.cs.tut.fi/~klap/iiro/).
Kronland-Martinet,
R, J. Morlet and A. Grossmann (1987). Analysis of sound patterns through wavelet
transforms, International Journal of Pattern Recognition and Artificial Intelligence,
2, pp. 97-126.
Krumhansl, C.L. (1991(90)).
Cognitive Foundations of Musical Pitch. Oxford University Press, Oxford (New
York).
Kuhn, W.B. (1990). A
real-time pitch recognition algorithm for music applications, CMJ 14(3), pp. 60-71.
Large, E.W. and J.F. Kolen (1994). Resonance and the perception of
musical meter, Connection Science 6, pp. 177-208.
Large, E.W. (1995).
Beat tracking with a nonlinear oscillator, IJCAI Workshop on Artificial Intelligence
and Music.
Lee, C.S. (1986). The
rhythmic interpretation of single musical sequences: towards a perceptual
model, In Musical Structure and Cognition (ed. Howell, P., I. Cross and R.West),
pp. 53-69.
Lerdahl, F. and R. Jackendoff
(1983). A Generative Theory of Tonal Music. MIT Press, Cambridge, Massachusetts.
Longuet-Higgins, H.C.
(1976). Perception of melodies, Nature 263/5579, pp. 646-653.
Longuet-Higgins, H.C.
(1978). The perception of music, Interdisciplinary Science Reviews 3(2), pp.
148-156.
Longuet-Higgins, H.C.
and C.-S. Lee (1982). The perception of musical rhythms, Perception 11,
pp. 115-128.
Longuet-Higgins, H.C.
and C.S. Lee (1984). The rhythmic interpretation of monophonic music, MP 1,
pp. 424-441.
Longuet-Higgins, H.C.
(1987). Mental Processes, MIT Press.
Lunney, H.W.M. (1974).
Time as heard in speech and music, Nature 249, p. 592.
Maher, R.C. (1989).
An Approach for the Separation of Voices in Composite Musical Signals, Ph.D.
thesis, University of Illinois, Urbana-Champaign.
Maher, R.C. (1990).
Evaluation of a Method for Separating Digitized Duet Signals, JAES 38(12),
pp. 956-979.
Marcus, S.M. (1981).
Acoustic determinants of perceptual center (P-center) location, Perception
and Psychophysics 30(3), pp. 247-256.
Markel, J.D. and A.H.
Gray Jr. (1976). Linear Prediction of Speech, Springer-Verlag, New York.
Marolt, M. (1997).
A music transcription system based on multiple-agents architecture, Proceedings
of Multimedia and Hypermedia Systems Conference MIPRO’97 Opatija, Croatia,
(http://lgm.fri.uni-lj.si/~matic/).
Marolt, M. (1998).
Feedforward neural networks for piano music transcription, Proceedings of
the XIIth Colloquium on Musical Informatics, pp. 240-243.
Martin, K.D. (1996). A Blackboard
System for Automatic Transcription of Simple Polyphonic Music. MIT Media Laboratory
Perceptual Computing Section Technical Report No. 385. (http://xenia.media.mit.edu/~kdm//professional.html).
Martin, K.D. (1996). Automatic
transcription of simple polyphonic music: robust front end processing. MIT
Media Laboratory Perceptual Computing Section Technical Report No. 399. (http://xenia.media.mit.edu/~kdm//professional.html).
Martin, K.D., E.D.
Scheirer and B.L. Vercoe (1998). Music content analysis through models of
audition, Proceedings ACM Multimedia Workshop on Content Processing for Multimedia
Applications, (http://xenia.media.mit.edu/~kdm//professional.html).
McAdams, S. and A. Bregman
(1979). Hearing musical streams, CMJ 3(4), pp. 26-43.
McAdams, S. (1996).
Audition: cognitive psychology of music in The Mind-Brain Continuum (Eds.
R. Llinas, P. Churchland), MIT Press, 1996, pp. 251-279.
McNab, R., L.A. Smith
and I.H. Witten (1995). Signal processing for melody transcription, Working
paper 95/22, University of Waikato, Hamilton, New Zealand.
McNab, R. (1996). Interactive
applications of music transcription, M.Sc. thesis, University of Waikato -
New Zealand.
McNab, R.J., L.A. Smith,
I.H. Witten, C.L. Henderson and S.J. Cunningham (1996). Towards the digital
music library: tune retrieval from acoustic input, Proceedings of ACM Digital
Libraries’96, pp. 11-18.
McNab, R.J., L.A. Smith,
D. Bainbridge and I.H. Witten (1997). The New Zealand digital library melody
index, D-Lib Magazine (http://www.dlib.org/dlib/may97/meldex/05witten.html).
Medan, Y., E. Yair
and D. Chazan (1991). Super resolution pitch determination of speech signals,
IEEE ASSP 39(1), pp. 40-48.
Meddis, R. and M.J.
Hewitt (1991). Virtual pitch and phase sensitivity of a computer model of
the auditory periphery. I: pitch identification, JASA 89(6), pp. 2866-2882.
Mellinger, D.K. and
B. Mont-Reynaud (1991). Sound explorer: A workbench for investigating source
separation, ICMC’91, pp. 90-94.
Mellinger, D.K. (1991).
Event Formation and Separation in Musical Sounds. Ph.D. thesis, Dept. of Computer
Science, Stanford University, (ftp://ccrma-ftp.stanford.edu/pub/Publications/Theses).
Michon, J.A. (1964).
Studies on subjective duration: I Differential sensitivity in the perception
of repeated temporal intervals, Acta Psychologica 22, pp. 441-450.
Mitchell, T.M. (1997).
Machine Learning. McGraw-Hill International Editions.
Moelants, D. and C.
Rampazzo (1997). A computer system for the automatic detection of perceptual
onsets in a musical signal, In KANSEI, the technology of emotion (ed. Camurri,
A.), pp. 140-146.
Mont-Reynaud, B. (1985).
Problem-solving Strategies in a Music Transcription System, IJCAI’85, pp.
916-918.
Mont-Reynaud, B. and M. Goldstein (1985). On finding rhythmic patterns
in musical lines, ICMC’85, pp. 391-397.
Mont-Reynaud, B. and
D.K. Mellinger (1989). A computational model of source separation by frequency
co-modulation, Proceedings of the First International Conference on Music
Perception and Cognition, pp. 99-102.
Mont-Reynaud, B. and
E. Gresset (1990). PRISM: Pattern recognition in sound and music, ICMC’90,
pp. 153-155.
Mont-Reynaud, B. (1992).
Machine hearing research at CCRMA: An overview, CCRMA Research Overview, Department
of Music, Stanford University, pp. 24-32, (ftp://ccrma-ftp.stanford.edu/pub/Publications/).
Moore (ed.) (1995).
Hearing. Handbook of Perception and Cognition (2nd edition), Academic Press
Inc.
Moore, B., B. Glasberg
and T. Baer (1997). A model for the prediction of thresholds, loudness and
partial loudness, JAES 45(4), pp. 224-240.
Moorer, J.A. (1975).
On the segmentation and analysis of continuous musical sound by digital computer,
Ph.D. thesis, Department of Computer Science, Stanford University.
Moorer, J.A. (1977).
On the transcription of musical sound by computer, CMJ, 1(4), pp. 32-38.
Moorer, J.A. (1978). The use of the linear prediction of speech in computer music
applications, JAES, 27(3), pp. 134-140.
Moorer, J.A. (1984).
Algorithm design for real-time audio signal processing, ICASSP’84, pp. 12.B.3.1-12.B.3.4.
Moreno, E.I. (1992).
The existence of unexplored dimensions of pitch: Expanded chromas, ICMC’92.
Morton, J., S.M. Marcus
and C. Frankish (1976). Perceptual centers (P-centers). Psychological Review
83(5), pp. 405-408.
Nakajima, Y., G. ten
Hoopen and R. van der Wilk (1991). A new illusion of time perception, MP 8,
pp. 431-448.
Nakamura, Y. and S.
Inokuchi (1979). Music information processing system in application to comparative
musicology, IJCAI’79, pp. 633-635.
Ng, K., R. Boyle and
D. Cooper (1996). Automatic detection of tonality using note distribution,
JNMR 25(4), pp. 369-381.
Niihara, T. and S. Inokuchi
(1986). Transcription of sung song, ICASSP’86, pp. 1277-1280.
Noll, A.M. (1967).
Cepstrum pitch determination, JASA 41(2), pp. 293-309.
Nunn, D. (1994). Source
separation and transcription of polyphonic music.
Parncutt, R. (1994).
A perceptual model of pulse salience and metrical accent in musical rhythms,
MP 11, pp. 409-464.
Patterson, R.D. and
J. Holdsworth (1990). An introduction to auditory sensation processing, In
HAM HAP 1(1).
Pearson, E.R.S. and
R.G. Wilson (1990). Musical event detection from audio signals within a multiresolution
framework, ICMC’90, pp. 156-158.
Pearson, E.R.S. (1995).
The multiresolution Fourier transform and its application to polyphonic audio
analysis, Technical Report CC-RR-282, University of Warwick.
Phillips, M.S. (1985).
A feature-based time domain pitch tracker, JASA 77, S9-S10(k).
Pielemeier, W.J. and
G.H. Wakefield (1996). A high-resolution time-frequency representation for
musical instrument signals, JASA 99(4), pp. 2382-2396.
Pierce, J.R. (1991).
Periodicity and pitch perception, JASA, 90, pp. 1889-1893.
Piszczalski, M. (1977). Automatic music transcription, CMJ 1(4), pp. 24-31.
Piszczalski, M. and
B. Galler (1979). Predicting musical pitch from component frequency ratios,
JASA 66(3), pp. 710-720.
Piszczalski, M. and
B. Galler (1979). Computer Analysis and Transcription of Performed Music:
a Project Approach, Computers and the Humanities 13, pp. 195-206.
Piszczalski, M., B.
Galler, R. Bossemeyer, M. Hatamian and F. Looft (1981). Performed music: analysis,
synthesis, and display by computer, JAES 29(1/2), pp. 38-46.
Piszczalski, M. (1986).
A computational model for music transcription, Ph.D. thesis, University of
Stanford.
Pollastri, E. (1998).
Melody-retrieval based on pitch-tracking and string-matching methods, Proceedings
of the XIIth Colloquium on Musical Informatics.
Povel, D.-J. and H.
Okkerman (1981). Accents in equitone sequences, Perception & Psychophysics
30(6), pp. 565-572.
Povel, D.-J. and P.
Essens (1985). Perception of temporal patterns, MP 2, pp. 411-440.
Prame, E. (1994). Measurements
of the vibrato rate of ten singers, JASA 96(4), pp. 1979-1984.
Pressing, J. and P.
Lawrence (1993). Transcribe: A comprehensive autotranscription program, ICMC’93,
pp. 343-345.
Privosnik, M. and M.
Marolt (1998). A system for automatic transcription of music based on multiple
agents architecture, Proceedings of MELECON’98, Tel Aviv, Israel, pp. 169-172.
Proakis, J. and D. Manolakis
(1996). Digital Signal Processing. 3rd ed. Englewood Cliffs, NJ: Prentice-Hall.
Puckette, M. (1995).
Score following using the sung voice, ICMC’95, pp. 175-178.
Rabiner, L.R. and B. Gold (1975). Theory and Applications of Digital Signal Processing.
Englewood Cliffs: Prentice-Hall.
Rabiner, L.R., M.J. Cheng, A.E. Rosenberg and C.A. McGonegal (1976). A comparative
performance study of several pitch detection algorithms, IEEE Transactions
on Acoustics, Speech and Signal Processing, 24(5), pp. 399-418.
Rabiner, L.R. (1977).
On the use of autocorrelation analysis for pitch detection, IEEE Transactions
on Acoustics, Speech and Signal Processing, 25(1), pp. 24-33.
Raskinis, G. (1998).
Preprocessing of folk song acoustic records for transcription into music scores,
Informatica 9(3), pp. 343-364.
Raskinis, G. (2000).
Automatic Transcription of Lithuanian Folk Songs, PhD. Thesis, Vytautas Magnus
University, Kaunas, Lithuania.
Remmel, M., I. Ruutel,
J. Sarv and R. Sule (1975). Automatic notation of one-voiced song, Academy
of Sciences of the Estonian SSR, Institute of Language and Literature, Preprint
KKI-4 (Ed. Ü. Tedre), Tallinn, Estonia.
Repp, B.H. (1994). On
determining the basic tempo of an expressive music performance, Psychology
of Music 22, pp. 157-167.
Roads, C. (1996). The
Computer Music Tutorial. MIT Press, Cambridge, Massachusetts.
Roberts, S.C. and M.
Greenhough (1995). Rhythmic pattern processing using a self-organising neural
network, ICMC’95, pp. 412-419.
Rodet, X. and S. Rossignol
(1998). Automatic characterization of musical signals: feature extraction
and temporal segmentation, ACM Multimedia’98.
Rolland, P.-Y. (1998).
Découverte automatique de régularités dans les séquences et application à
l’analyse musicale, Thèse de doctorat de l’Université Paris VI.
Rolland, P.-Y., G.
Raskinis and J.-G. Ganascia (1999). Musical Content-Based Retrieval: an Overview
of the Melodiscov Approach and System, ACM Multimedia’99, pp. 81-84.
Rosenthal, D. (1992).
Emulation of human rhythm perception, CMJ 16, pp. 64-76.
Rosenthal, D. (1992). Intelligent rhythm tracking, ICMC’92, pp. 227-230.
Rosenthal, D. (1992).
Machine rhythm: computer emulation of human rhythm perception, MIT Media Laboratory,
Ph.D. Thesis.
Rosenthal, D., M. Goto
and Y. Muraoka (1994). Rhythm tracking using multiple hypotheses, ICMC’94,
pp. 85-88, (http://staff.aist.go.jp/m.goto/publications.html).
Rossignol, S. (1997).
Segmentation - Extraction du vibrato: Premier rapport d’activité, Rapport
de stage: Premier rapport d’activité de thèse, jan 1997.
Rossignol, S., X. Rodet,
J. Soumagne, J.-L. Colette and P. Depalle. (1998). Feature extraction and
temporal segmentation of acoustic signals, ICMC’98, (http://mediatheque.ircam.fr/articles/textes/Rossignol98a/).
Scarborough, D.L.,
B.O. Miller and J.A. Jones (1989). Connectionist models for tonal analysis,
CMJ 13(3), pp. 49-55.
Schafer, R.W. and L.R.
Rabiner (1973). A digital signal processing approach to interpolation. Proc.
IEEE 61, pp. 692-702.
Schloss, A.W. (1985).
On the automatic transcription of percussive music: From acoustic signal to
high-level analysis. Ph.D. Thesis, Department of Hearing and Speech, Stanford
University.
Schroeder, M.R. (1968).
Period histogram and product spectrum: new methods for fundamental frequency
measurement, JASA, 43(4), pp. 829-834.
Secrest, B.G. and G.R.
Doddington (1982). Postprocessing techniques for voice pitch trackers, ICASSP’82,
pp. 172-175.
Secrest, B.G. and G.R.
Doddington (1983). An integrated pitch tracking algorithm for speech systems,
ICASSP’83, pp. 1352-1355.
Scheirer, E.D. (1995).
Extracting expressive performance information from recorded music, M.Sc. thesis,
Program in Media Arts and Science, MIT, 1995, (http://sound.media.mit.edu/papers.html#eds).
Scheirer, E.D. (1995).
Using musical knowledge to extract expressive performance information from
audio recordings, IJCAI’95 Workshop on Computational Auditory Scene Analysis,
pp. 153-160, (http://sound.media.mit.edu/papers.html#eds).
Scheirer, E.D. (1996).
Bregman’s chimerae: music perception as auditory scene analysis, 4th ICMPC,
(http://sound.media.mit.edu/papers.html#eds).
Scheirer, E.D. (1997).
Pulse tracking with a pitch tracker, Proceedings of the 97 IEEE Workshop on
Applications of Signal Processing to Audio and Acoustics, (http://sound.media.mit.edu/papers.html#eds).
Scheirer, E.D. (1998).
Tempo and beat analysis of acoustic musical signals, JASA 103(1), pp. 588-601.
Seashore, C.E. (1967).
Psychology of Music, New York: Dover.
Shepard, R.N. (1982). Structural representation
of musical pitch, In D. Deutsch (Ed.), The psychology of music, New York: Academic
Press, pp. 343-390.
Shepard, R.N. and D.S.
Jordan (1984). Auditory illusions demonstrating that tones are assimilated
to an internalized musical scale, Science 226, pp. 1333-1334.
Shmulevich, I. and E.J.
Coyle (1997).
Establishing the tonal context for musical pattern recognition, Proceedings
of the 1997 IEEE Workshop on Applications of Signal Processing to Audio and
Acoustics.
Shmulevich, I. and
D. Povel. (1998). Rhythm complexity measures for music pattern recognition,
Proceedings of the IEEE Workshop on Multimedia Signal Processing.
Shuttleworth, T. and
R.G. Wilson (1993). Note Recognition in Polyphonic Music using Neural Networks,
Technical Report CS-RR-252, University of Warwick (ftp://ftp.dcs.warwick.ac.uk/reports/rr/252/).
Shuttleworth, T. and
R.G. Wilson (1995). A neural network for triad classification, ICMC’95, pp.
428-431.
Smith, L.S. (1994).
Sound segmentation using onsets and offsets, Journal of New Music Research
23, pp. 11-23.
Smith, L.S. (1993).
Temporal localisation and simulation of sounds using onsets and offsets, CCCN
Technical Report CCCN-16, University of Stirling.
Stainsby, T. (1996).
A system for the separation of simultaneous musical audio signals, ICMC’96,
pp. 75-78.
Stautner, J.(A.) (1982(83)).
The auditory transform (Analysis and Synthesis of Music Using the Auditory
Transform), M.Sc. thesis, Department of Electrical Engineering and Computer
Science, MIT, 1982.
Sterian, A. and G.H.
Wakefield (1996). Robust automated music transcription systems, ICMC’96, pp.
219-221.
Steedman, M.J. (1977).
The perception of musical rhythm and metre, Perception 6, pp. 555-569.
Strawn, J.M. (1980).
Approximations and syntactic analysis of amplitude and frequency functions
for digital sound synthesis, CMJ 4(3), pp. 3-24.
Sundberg, J. and P.
Tjernlund (1970). A computer program for the notation of played music, STL-QPSR
2-3/1970, pp. 46-49.
Sundberg, J. (1987). The
Science of the Singing Voice, Northern Illinois University Press, DeKalb,
Illinois.
Sundberg, J. (1991).
The Science of Musical Sounds, Academic Press.
Tait, C. (1995). Audio analysis
for rhythmic structure, ICMC’95, pp. 590-591.
Tait, C. and W. Findlay
(1996). Wavelet analysis for onset detection, ICMC’96, pp. 500-503.
Tanguiane, A. (1991). Criterion of data complexity in rhythm recognition, ICMC’91, pp.
559-562.
Tanguiane, A.S. (1993).
Artificial Perception and Music, Springer-Verlag.
Tanguiane, A. (1994).
A principle of correlativity of perception and its application to music recognition,
MP 11(4), pp. 465-502.
Taylor, I.J. and M.
Greenhough (1995). Neural network pitch tracking over the pitch continuum,
ICMC’95, pp. 432-435.
Terhardt, E. (1974).
Pitch, consonance, and harmony, JASA 55(5), pp. 1061-1069.
Thomassen, J.M. (1982).
Melodic accent: experiments and a tentative model, JASA 71(6), pp. 1596-1605.
Todd, N. (1994). The
auditory “primal sketch”: a multiscale model of rhythmic grouping, JNMR 23,
pp. 25-70.
Toiviainen, P. (1998).
Intelligent jazz accompanist: a real-time system for recognizing, following,
and accompanying musical improvisations, Proceedings of the XIIth Colloquium
on Musical Informatics, pp. 101-104.
Tuerk, C.M. (1990).
A Text-to-Speech system based on NETtalk, M.Sc. Thesis, Engineering Department,
Cambridge University, (http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/speech/systems/pt/0.html).
Uitdenbogerd, A.L.
and J. Zobel (1998). Manipulation of music for melody matching, ACM Multimedia’98
- Electronic Proceedings.
Vantomme, J.D. (1995).
The induction of musical structure using correlation, ICMC’95, pp. 585-586.
Vantomme, J.D. (1995).
Score following by temporal pattern, CMJ 19(3), pp. 50-59.
Vercoe, B.L. (1994). Perceptually-based music
pattern recognition and response, ICMPC’94.
Wightman, F. (1973).
The pattern transformation model of pitch, JASA 54(2), pp. 407-416.
Wilson, R.G. and E.R.S.
Pearson (1989). A multiresolution signal representation and its application
to the analysis of musical signals, ICMC’89.
Wilson, R.G. and T.
Shuttleworth (1995). The recognition of musical structures using neural networks,
IJCAI’95.
Wöhrmann, R. and L.
Solbach (1995). Preprocessing for the automated transcription of polyphonic
music: linking wavelet theory and auditory filtering, ICMC’95, pp. 396-399.
Wold, E., T. Blum, D.
Keislar and J. Wheaton (1996). Content-based classification, search, and retrieval
of audio, IEEE Multimedia, 3(3), pp. 27-36, (ftp://ftp-db.deis.unibo.it/pub/ibartolini/Courses/Papers/CBClassSrch&RetrOfAudio.pdf).
Zwicker, E. (1977).
Procedure for calculating loudness of temporally variable sounds, JASA 62(3),
pp. 675-682.